Geosemantics

from Wikipedia, the free encyclopedia

Geosemantics (also commonly called geospatial semantics) is an interdisciplinary field of research that deals with the meaning of geoinformation.

The vision of the virtual globe

Basics

Factors such as the intended application or the knowledge of the author influence the interpretation of spatial data. Integration across the boundaries of different subject areas and applications requires that divergent interpretations can be excluded. This can be achieved, among other things, with explicit descriptions of the meaning (the semantics). The development of strategies, computational methods and tools to achieve this semantic interoperability is a goal of geosemantics. Long-term goals include improving the usability of geographic information and the methods for retrieving it. Geosemantics draws on knowledge from geoinformatics, geography, philosophy, cognitive science, linguistics, mathematics and computer science.

Al Gore's vision of a virtual globe describes the seamless integration of geographic information, such as digital maps, historical documents and tourist sights, with the help of a virtual globe. By relating these data to one another, topics such as climate change can be visualized and made understandable to the layperson. However, this requires a seamless interplay of data of various origins and quality. By making different perspectives on the world explicit and developing methods for their integration and translation, geosemantics builds bridges between different domains.

Sources of misunderstanding

Humorous place-name sign in New Cuyama. Mathematically, the result is "correct". The sum of thematic, spatial and temporal information has no meaning.

Communicating facts about the environment often leads to misunderstandings. The same environment can be understood and described very differently depending on occupation, cultural background, language, belief, age and purpose. A road, for example, can be viewed from different angles: for a road builder, the road as a connection between cities increases the mobility of the population; for an ecologist, it is more of an obstacle for animals that cuts up nature reserves. This results in different perspectives on the environment.

A particular semantic problem arises when integrating spatial data from different sources. Both rail and road networks can be understood as transport networks. If a semantic description is missing, the two data sets can still be combined, but route planning for cars then delivers useless results. Since spatial data form the basis for numerous important decisions, the original meaning of the information must be preserved when the data are used. Sustainable transport planning, for example, requires that data not only about traffic and ecology but also about demographics or land use are taken into account and successfully integrated.

When integrating data from different sources in a common application, heterogeneity must be considered at all levels. Differences in syntax can be traced back to different file formats and encodings. Semantic conflicts can stem from different conceptions of a field of knowledge, different understandings of the same term or a different focus in data acquisition. The town sign of New Cuyama illustrates the potential effect of a lack of semantic interoperability. Three numbers were added mathematically correctly - syntactic interoperability is guaranteed in this case by the formally defined number system. The actual meaning of these three numbers (population, height above sea level and year of foundation), however, rules out a summation.
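The difference between syntactic and semantic compatibility on the town sign can be sketched in a few lines of Python. This is only an illustration: the `Quantity` class, its `meaning` labels and the three numbers are hypothetical and not taken from the actual sign.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    """A number together with an explicit label for what it means."""
    value: float
    meaning: str  # hypothetical labels, e.g. "population", "elevation_ft"

    def __add__(self, other: "Quantity") -> "Quantity":
        if not isinstance(other, Quantity):
            return NotImplemented
        if self.meaning != other.meaning:
            # The numbers are syntactically compatible, but their meanings are not.
            raise TypeError(f"cannot add {self.meaning} to {other.meaning}")
        return Quantity(self.value + other.value, self.meaning)


# Illustrative values only, not the figures on the real sign.
population = Quantity(500, "population")
elevation = Quantity(2000, "elevation_ft")
founded = Quantity(1950, "founding_year")

# Syntactic level: plain numbers can always be added.
syntactic_total = population.value + elevation.value + founded.value


def semantic_total_allowed() -> bool:
    """Return True only if the three quantities may be summed meaningfully."""
    try:
        population + elevation + founded
    except TypeError:
        return False
    return True
```

Adding the bare values succeeds, while adding the labelled quantities is rejected; an explicit description of meaning is what makes the error detectable by a machine.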

Semantic interoperability mainly relates to thematic aspects of spatial data. Unlike the spatial and temporal dimensions, the thematic dimension has no formal reference system that could be used for translations. In addition, there is no lingua franca or common vocabulary of the kind that is usual in medicine. Explicit and unambiguous specifications of the semantics are therefore a prerequisite for correct understanding.

On the basis of explicit and machine-readable descriptions, geosemantics offers an approach to the problems mentioned above. Based on these descriptions, software agents can identify similarities and differences between different perspectives and provide solutions to ensure semantic interoperability. In a service-oriented architecture, syntactic interoperability is an indispensable requirement for the seamless exchange of data between services. Geosemantic methods can then ensure that a search query is interpreted by the data recipient as the data provider intends.

... and possible effects

Geosemantic problems often lead to minor and major difficulties for experts and laypeople. These can be avoided if providers of geographic information can describe what is meant and users can evaluate these descriptions.

Where is Sydney located?
Several travelers discovered that place names are rarely unique when they landed in Canada instead of Australia. The airline's booking system failed to recognize the ambiguity of “Sydney” in flight planning.
What does distance mean?
On May 3, 2003, the renowned magazine The Economist depicted the ranges of North Korean missiles using concentric circles on a world map; the circles were also much too small. In the map projection used, however, lines at the same distance from a point on the earth are not circles but rather complex curves. In a corrected map that The Economist published two weeks later under the heading "Flat-earth thinking", entire continents that were previously outside the range of long-range missiles (Africa, Europe, North America) were suddenly inside.
What do units of measure mean?
In 1999, the Mars probe Climate Orbiter entered too low an orbit around the planet and is presumed to have burned up there because one participating company supplied numerical impulse values in pound-force seconds, while the receiving software interpreted them in the metric unit newton-seconds.
The Greek mathematician, geographer and astronomer Eratosthenes greatly improved the measurement of the circumference of the earth. However, he expressed his result in the length unit of stadia. Since there were several units of this name in antiquity, the accuracy of his measurement remains uncertain.
What does street width mean?
Some street data collections define street width as the width of the lane, others as the width of the clearance profile . When calculating sealed areas or routes for heavy haulage with excess width, such deviations lead to considerable differences.
What is an accommodation?
For a simple query such as “find all accommodation in Amsterdam” to a search engine like Google, it must be clarified what is meant by “accommodation”. Does it include, for example, a botel or a campsite with bungalows? Depending on the interpretation used, the search results can differ considerably or be incomplete.
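The unit example above can be made concrete with a minimal Python sketch, assuming a hypothetical `Impulse` type that always carries its unit. The conversion factor follows from the definition of the pound-force; the numerical value 100.0 is purely illustrative.

```python
from dataclasses import dataclass

# 1 pound-force second expressed in newton-seconds.
LBF_S_TO_N_S = 4.4482216152605


@dataclass(frozen=True)
class Impulse:
    """Hypothetical value type: an impulse that is never stored without its unit."""
    value: float
    unit: str  # "N*s" or "lbf*s"

    def to_newton_seconds(self) -> float:
        if self.unit == "N*s":
            return self.value
        if self.unit == "lbf*s":
            return self.value * LBF_S_TO_N_S
        raise ValueError(f"unknown unit: {self.unit}")


# A bare number says nothing about its unit; the two readings differ by a factor of about 4.45.
bare_value = 100.0
read_as_metric = Impulse(bare_value, "N*s").to_newton_seconds()
read_as_imperial = Impulse(bare_value, "lbf*s").to_newton_seconds()
misreading_factor = read_as_imperial / read_as_metric
```

Attaching the unit to the value is a very small form of explicit semantics: the recipient no longer has to guess which reading the provider intended.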

Application areas

Solving geosemantic problems plays a role in many applications. These range from geographic information and navigation systems to virtual globes and geodata infrastructures through to sensor networks. These applications can be roughly divided into four areas according to the type of semantic problems that occur: semantic interoperability and integration, semantics-based search for geographic information, usability, and checking of data integrity.

Semantic interoperability and integration

Different understandings of the term street can lead to semantic conflicts.

Web services and other software components must be able to use and pass on information without causing misunderstandings. Two basic types of such interoperability are distinguished: syntactic and semantic interoperability. Two services can, for instance, agree to describe their data in a certain format. However, this syntactic agreement cannot prevent one service from interpreting the data of the other differently. For example, a data provider may classify a traffic link as a road as long as it has a minimum width, while the road surface may be critical for a particular query. Such a lack of semantic interoperability can lead to tourist buses getting stuck in the mud.
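The following Python sketch illustrates the road example. Both sides agree on the record format (syntactic interoperability) but apply different definitions of "road". The class, the thresholds and the sample records are assumptions made for this illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrafficLink:
    """Shared record format that both services have agreed on."""
    name: str
    width_m: float
    surface: str  # e.g. "asphalt", "gravel", "dirt"


MIN_ROAD_WIDTH_M = 3.0                 # assumed provider threshold
PAVED_SURFACES = {"asphalt", "concrete"}  # assumed consumer requirement


def provider_is_road(link: TrafficLink) -> bool:
    """Provider's meaning of 'road': anything wide enough."""
    return link.width_m >= MIN_ROAD_WIDTH_M


def consumer_is_bus_road(link: TrafficLink) -> bool:
    """Consumer's meaning of 'road': wide enough and paved."""
    return link.width_m >= MIN_ROAD_WIDTH_M and link.surface in PAVED_SURFACES


links = [
    TrafficLink("main road", 6.0, "asphalt"),
    TrafficLink("forest track", 3.5, "dirt"),
]

# Links the provider calls a road but that the consumer cannot use as one.
conflicts = [
    link.name
    for link in links
    if provider_is_road(link) and not consumer_is_bus_road(link)
]
```

Here `conflicts` contains the forest track: the data exchange works, yet the two definitions disagree, which is exactly the case where the bus ends up in the mud.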

Find and merge web services

In order to find and merge web services automatically, they must be described semantically. One example of an implementation of semantic services is the European research project SWING (Semantic Web-service Interoperability for Geospatial decision making). The aim was to annotate web services semantically in such a way that they can be meaningfully combined and used for processing geographic objects (e.g. storage facilities). This required agreement on a common semantic model of the application domain in the form of an application ontology (see section "Construction of formal theories about the environment"), in the development of which application experts participated. Meetings between these experts and ontology engineers made it possible to take the users' content-related concerns into account without the users having to be familiar with the data sources or processing mechanisms.

Semantic sensor web

The Semantic Sensor Web is a new research topic in geosemantics. It deals with the semi-automatic integration of sensors and sensor data in spatial data infrastructures. Results from Sensor Web Enablement (SWE) are combined with the possibilities of the Semantic Web. The aim of the Semantic Sensor Web is to make observation results accessible on the Internet instead of isolating them in data silos, and thereby to give scientists a better understanding of environmental phenomena. In the long term, semantic sensor services should allow sensors to be selected, configured and coordinated with one another automatically at the request of a user. For example, the system should understand the query "Will there be a flood in section XY of the Danube if it continues to rain for the next seven hours?" To do this, it has to know the terms “river section”, “Danube” and especially “flooding”, as well as the relationship between rain and flooding. It can use Sensor Planning Services (SPS) and Sensor Alerting Services (SAS) to answer the request.

Semantics-based search for geographic information

Information retrieval methods deal with the search for relevant information on the basis of a specific information need. This includes the indexing of data, the relevance assessment of results and the calculation of quality measures such as recall and precision. Geographic information retrieval extends the classic problem to include spatial and temporal components. An exemplary search for “pubs in Vienna's old town” requires both thematic and topological correspondence between the query and the search results. Understanding the data sources and queries requires methods in the following areas:

  • The detection of geographical references in text, as a special case of computational linguistics.
  • The resolution of ambiguous place names into an explicit geographic reference.
  • The handling of vague geographical terminology, that is, geographical references to (often colloquial) place names with undefined boundaries, for example Vienna's old town.
  • Spatial indexing of the spatio-temporal component of geographic information in addition to text-based indexing.
  • Geographical relevance assessment, which adds a spatio-temporal component to typical relevance measures. A ranking is therefore based not only on the textual match but also, for example, on the distance to the city center.
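The last point, geographical relevance assessment, can be sketched as a score that mixes a textual match with a distance-based component. The weighting scheme, the exponential distance decay, the `scale_km` parameter and all coordinates below are assumptions chosen for illustration; real systems use more elaborate measures.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def text_score(query_terms, doc_terms):
    """Fraction of query terms that occur in the document."""
    query = set(query_terms)
    return len(query & set(doc_terms)) / len(query) if query else 0.0


def geo_score(distance_km, scale_km=2.0):
    """Score that decays from 1.0 at the reference point (assumed decay model)."""
    return math.exp(-distance_km / scale_km)


def relevance(query_terms, doc, center, text_weight=0.5):
    """Weighted mix of textual and spatial relevance."""
    distance = haversine_km(center[0], center[1], doc["lat"], doc["lon"])
    return (text_weight * text_score(query_terms, doc["terms"])
            + (1 - text_weight) * geo_score(distance))


CENTER = (48.2082, 16.3738)  # approximate city centre of Vienna, illustrative
DOCS = [  # hypothetical documents
    {"id": "pub_near", "terms": ["pub", "beer"], "lat": 48.2090, "lon": 16.3720},
    {"id": "pub_far", "terms": ["pub", "beer"], "lat": 48.2500, "lon": 16.4500},
    {"id": "museum_near", "terms": ["museum"], "lat": 48.2085, "lon": 16.3730},
]

ranked = sorted(DOCS, key=lambda doc: relevance(["pub"], doc, CENTER), reverse=True)
ranking = [doc["id"] for doc in ranked]
```

With these assumptions the nearby pub ranks first, because it matches both thematically and spatially; the two pubs have identical text scores and are separated only by distance.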

Methods for the semantics-based retrieval of geographic information require semantically annotated data sources. These methods therefore combine classic information retrieval with deductive and inductive reasoning methods, especially spatial reasoning. One application is, for example, similarity-based user interfaces for working with gazetteers.

Gazetteers

See main article: Gazetteer.

The classic ADL Gazetteer user interface compared to a semantics-based version, which suggests generic terms and similar terms to make navigation easier.

Gazetteers are place-name directories that connect place names with a geographic region (e.g. via their bounding box) and an object type (e.g. city). While gazetteers have been created for different purposes, most offer at least two basic functions: one to return the geographic reference associated with a name and one to return the object type. The object types usually come from informal thesauri with natural-language descriptions. However, some gazetteers also use more formal knowledge representations and ontologies. Gazetteers appear as components in many applications, e.g. in web mapping systems, spatial search engines and geoparsers. Queries such as "hotels in Vienna" are processed by deriving the associated geographical reference (which can also be just a bounding rectangle) from the place name Vienna and then searching for all hotels within this region. To interpret ambiguous place names like Vienna, the user's IP address can be used, for example. The methods described above for a better understanding of search queries also apply in particular to gazetteers. Well-known gazetteers used worldwide are, for example, the Getty Thesaurus of Geographic Names and Geonames.org.
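The two basic gazetteer functions and the "hotels in Vienna" query can be sketched as follows. This is a toy in-memory gazetteer, not the interface of any real service; all entries, bounding boxes, hotel names and the `country_hint` parameter (standing in for a hint derived from, say, the user's IP address) are hypothetical and only roughly realistic.

```python
from typing import NamedTuple, Optional


class BBox(NamedTuple):
    min_lon: float
    min_lat: float
    max_lon: float
    max_lat: float

    def contains(self, lon: float, lat: float) -> bool:
        return self.min_lon <= lon <= self.max_lon and self.min_lat <= lat <= self.max_lat


class Entry(NamedTuple):
    name: str
    feature_type: str
    country: str
    footprint: BBox


# Rough, illustrative footprints for two places that share a name.
GAZETTEER = [
    Entry("Vienna", "city", "AT", BBox(16.18, 48.12, 16.58, 48.32)),
    Entry("Vienna", "town", "US", BBox(-77.30, 38.87, -77.23, 38.93)),
]

# Hypothetical points of interest as (name, lon, lat).
HOTELS = [("Hotel A", 16.37, 48.21), ("Hotel B", -77.26, 38.90), ("Hotel C", 2.35, 48.86)]


def footprints(name: str) -> list:
    """Basic function 1: geographic references associated with a name."""
    return [e.footprint for e in GAZETTEER if e.name == name]


def feature_types(name: str) -> list:
    """Basic function 2: object types associated with a name."""
    return sorted({e.feature_type for e in GAZETTEER if e.name == name})


def resolve(name: str, country_hint: Optional[str] = None) -> Optional[Entry]:
    """Return a single entry, or None if the name stays ambiguous or unknown."""
    candidates = [e for e in GAZETTEER if e.name == name]
    if country_hint is not None:
        candidates = [e for e in candidates if e.country == country_hint]
    return candidates[0] if len(candidates) == 1 else None


def hotels_in(name: str, country_hint: Optional[str] = None) -> list:
    """Derive the footprint from the place name, then filter hotels by it."""
    entry = resolve(name, country_hint)
    if entry is None:
        return []
    return [hotel for hotel, lon, lat in HOTELS if entry.footprint.contains(lon, lat)]
```

Without a hint, `hotels_in("Vienna")` returns nothing because the name is ambiguous; with `country_hint="AT"` it returns the hotel inside the Austrian footprint. Refusing to guess is one possible design choice; ranking the candidates would be another.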

Usability

Geographic information systems (GIS) offer operators for data analysis, e.g. to formulate queries based on topological relationships ("Which buildings are in the floodplain?") or to calculate metric properties ("How far is my house from the floodplain?"). The meaning of such an operator as intended by the system developer, as actually implemented, as documented and as expected by the user may differ from one another. However, effective methods for describing these meanings are still lacking. Graphics with examples (such as the following for the spatial overlap operator in a GIS) are often unclear or ambiguous.

This example graphic describes an overlap with a thin frame and differently colored objects. The relationship between an individual object and this framework remains unclear.

With machine-readable operator semantics, a GIS would also be able to offer the user only those operators that are permitted for the selected objects. For example, the representation of equidistant lines on the earth's surface by concentric circles in a Mercator projection, as in the Economist example above, could be avoided. When GIS functions are called via web services and chained automatically, such semantics-based selection criteria become indispensable.
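Why concentric circles are the wrong operator here can be checked numerically. The sketch below, which assumes a spherical earth with a radius of 6371 km and an arbitrarily chosen centre and range, computes points at the same great-circle distance from a centre, projects them with the spherical Mercator formula and compares their distances from the projected centre.

```python
import math

EARTH_RADIUS_KM = 6371.0  # spherical approximation


def destination(lat_deg, lon_deg, bearing_deg, distance_km):
    """Point reached from (lat, lon) after travelling distance_km along a bearing."""
    lat1, lon1, theta = map(math.radians, (lat_deg, lon_deg, bearing_deg))
    delta = distance_km / EARTH_RADIUS_KM
    lat2 = math.asin(math.sin(lat1) * math.cos(delta)
                     + math.cos(lat1) * math.sin(delta) * math.cos(theta))
    lon2 = lon1 + math.atan2(math.sin(theta) * math.sin(delta) * math.cos(lat1),
                             math.cos(delta) - math.sin(lat1) * math.sin(lat2))
    return math.degrees(lat2), math.degrees(lon2)


def mercator(lat_deg, lon_deg):
    """Spherical Mercator projection on a unit sphere."""
    lat = math.radians(lat_deg)
    return math.radians(lon_deg), math.log(math.tan(math.pi / 4 + lat / 2))


def projected_radii(lat_deg, lon_deg, distance_km, steps=36):
    """Map-plane distances from the projected centre to the equidistant points."""
    cx, cy = mercator(lat_deg, lon_deg)
    radii = []
    for i in range(steps):
        point = destination(lat_deg, lon_deg, 360.0 * i / steps, distance_km)
        x, y = mercator(*point)
        radii.append(math.hypot(x - cx, y - cy))
    return radii


# Illustrative centre (40 degrees north, prime meridian) and range (3000 km).
radii = projected_radii(40.0, 0.0, 3000.0)
distortion = max(radii) / min(radii)
```

If the equidistant line were a circle on the map, `distortion` would be 1. For this centre and range it is clearly larger, so drawing a circle on the projected map misstates the range in most directions.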

Check of data integrity

An essential aspect of the quality of geographic information is the integrity of the individual objects. Geosemantics can be used to check automatically whether the data provide a consistent image of reality. The plausibility of the attribute values and of the spatial relations to other objects is described with the help of logical and topological consistency (described in more detail in ISO standard 19113). Logical inconsistency affects individual objects, especially the values of their attributes. For example, the width of an object classified as a street must not be less than three meters; otherwise the object is inconsistent because it is incorrectly marked as a street. Topological inconsistency refers to well-defined relationships between objects. The object representing a motorway exit must necessarily be connected to the motorway and to another road object. Furthermore, a road object must not intersect a lake object if the data model also allows bridge objects. Rules can do more than check for inconsistencies, however. In the latter case, the system can also suggest alternative object types such as “bridge” or “ferry connection” for the street object and thus support the user in describing it.
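The three rules named in this paragraph can be written down as a small rule checker. The sketch below is a simplification under stated assumptions: the `Feature` record, its fields and the object identifiers are hypothetical, and "another road object" is approximated by the kind "street".

```python
from dataclasses import dataclass, field
from typing import Optional

MIN_STREET_WIDTH_M = 3.0  # threshold taken from the example in the text


@dataclass
class Feature:
    """Hypothetical feature record with precomputed topological relations."""
    fid: str
    kind: str  # "street", "motorway", "motorway_exit", "lake", ...
    width_m: Optional[float] = None
    connected_to: set = field(default_factory=set)
    intersects: set = field(default_factory=set)


def check(features):
    """Return (feature id, message) pairs for every violated rule."""
    by_id = {f.fid: f for f in features}
    problems = []
    for f in features:
        # Logical consistency: attribute values must fit the object type.
        if f.kind == "street" and f.width_m is not None and f.width_m < MIN_STREET_WIDTH_M:
            problems.append((f.fid, "logical: street narrower than 3 m"))
        # Topological consistency: an exit joins a motorway and another road.
        if f.kind == "motorway_exit":
            kinds = [by_id[n].kind for n in f.connected_to if n in by_id]
            if "motorway" not in kinds or "street" not in kinds:
                problems.append((f.fid, "topological: exit must join a motorway and another road"))
        # Topological consistency with a suggestion for repair.
        if f.kind == "street":
            for other in sorted(f.intersects):
                if other in by_id and by_id[other].kind == "lake":
                    problems.append(
                        (f.fid, "topological: street crosses lake; consider 'bridge' or 'ferry connection'"))
    return problems


features = [
    Feature("s1", "street", width_m=2.0),
    Feature("m1", "motorway"),
    Feature("e1", "motorway_exit", connected_to={"m1"}),
    Feature("l1", "lake"),
    Feature("s2", "street", width_m=6.0, intersects={"l1"}),
    Feature("e2", "motorway_exit", connected_to={"m1", "s2"}),
]
problems = check(features)
```

In this sample, `s1`, `e1` and `s2` are reported, while `e2` passes because it is connected to both a motorway and a street. A production system would express such rules declaratively and hand them to a reasoner rather than hard-coding them.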

Semantic strategies

A large number of information communities create and use spatial data. These communities are formed around tasks such as land registry administration or traffic, environmental and resource management. Semantic strategies help to establish these information communities and to make the intended use of the data explicit. Ultimately, they aim to enable successful communication in line with Paul Grice's cooperative principle: the participants in a communication tacitly assume that they share the meaning of certain expressions. However, if the participants have different expectations or come from different cultural or scientific contexts, special methods are required to coordinate the respective interpretations. These methods have to restrict the possible interpretations of the statements so that only the desired one remains. For example, an ecologist needs to know the intended purpose of a mapping agency's road data in order to use them for his own purposes. In general, spatial data users must be able to interpret the data in a way that is compatible with the intentions of the provider.

Quality standards: what is to be achieved

How high does an elevation have to be to be called a mountain?

The available semantic strategies can be assessed according to what kind of agreement they enable between the communication participants. Each interpretation links concepts to the underlying ideas and a particular observable context. In the case of geosemantics, the observable context is the human environment. Three types of agreement can be derived from this:

  • A terminological agreement on basic terms to be used in definitions, such as the name of a street and the “part-whole” relationship between habitats.
  • An ontological commitment that defines the objects that are being communicated. For example, that there are certain types of roads, that they have a recognizable width, and that they are always part of a road network.
  • The definition of the context to ensure that expressions are only used within the boundaries of a particular domain or area of application, thus avoiding possible ambiguities. For example, that road width refers to the paved road surface, but excludes cycle paths.

The film "The Englishman Who Went Up a Hill But Came Down a Mountain" gives a clear example of different understandings of the same terms and the resulting consequences. The inhabitants of the fictional Welsh village of Ffynnon Garw understood the term “mountain” quite differently from two English surveyors. The surveyors defined "mountain" from the standpoint of the national distribution of elevations as an elevation greater than 1,000 feet; the inhabitants, by contrast, understood it as a locally dominant elevation. Both parties used the same terms “mountain”, “hill” and “height”, and certainly agreed that landscapes should be classified according to height and that hills are smaller than mountains. But the context in which they used the terms “mountain” and “hill” remained ambiguous. When the residents realized that the national context outside Wales was the one that mattered and that their mountain would become a hill, they erected a cairn on top of their Garth Hill to adapt it to the national classification requirements. They could of course just as well have accepted two different contexts and called the elevation a mountain locally and a hill nationally. In a very similar way, the engineers of the Mars Climate Orbiter had failed to make the context of their measurements explicit, which would have required the use of unit standards and appropriate conversion rules.
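The role of context in this story can be sketched as a classification function that takes the context as an explicit argument. The 1,000-foot threshold comes from the text; the interpretation of "locally dominant" as "higher than all neighbouring elevations", the function names and the example heights are assumptions made for illustration.

```python
NATIONAL_THRESHOLD_FT = 1000.0  # surveyors' definition from the text


def classify_national(height_ft):
    """National context: a mountain is an elevation greater than 1,000 feet."""
    return "mountain" if height_ft > NATIONAL_THRESHOLD_FT else "hill"


def classify_local(height_ft, neighbour_heights_ft):
    """Local context (assumed reading): a mountain dominates its neighbours."""
    return "mountain" if all(height_ft > h for h in neighbour_heights_ft) else "hill"


def classify(height_ft, context, neighbour_heights_ft=()):
    """Classify an elevation; the context must always be stated explicitly."""
    if context == "national":
        return classify_national(height_ft)
    if context == "local":
        return classify_local(height_ft, neighbour_heights_ft)
    raise ValueError(f"unknown context: {context}")


# Hypothetical heights in feet, before and after a cairn is added.
before, after = 990.0, 1010.0
neighbours = (400.0, 650.0)

labels = {
    ("local", "before"): classify(before, "local", neighbours),
    ("national", "before"): classify(before, "national"),
    ("national", "after"): classify(after, "national"),
}
```

The same elevation is a mountain in the local context and a hill in the national one. Both labels are consistent once the context is part of the statement, which is the alternative to building a cairn.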

Semantic Engineering: How to Reach Agreement

As shown above, successful communication does not require agreeing on all terms within an information community. It is sufficient to agree on a few defining basic terms, on ontological stipulations regarding the use of these terms and on contexts. Geosemantics can be understood as an engineering science that supports just such a unification process. This task requires methods and tools to specifically limit interpretations. Such methods can be understood as a combination of three strategies:

Construction of formal theories about the environment

A successful strategy is to specify a formal theory that describes a particular view of the environment. In computer science, such a theory is called an ontology to emphasize that it deals with existing objects in the world, similar to the idea of ontology as a philosophical discipline. As a logical theory that describes the intended meaning of a formal vocabulary, it can be used to establish terminology and to communicate ontological commitments. In a “mountain” ontology, the Welsh residents and the English surveyors could have agreed on terms and on instructions for applying them to visible land elevations. The question of whether an elevation is a mountain or not could then have been answered by formal reasoning, provided that the parties had previously agreed on certain height measurements. The approach of treating geo-ontologies as theories of spatial common sense, similar to “naive physics”, has been described in the literature.

Independent semantics (emergent semantics)

Emergent semantics is about negotiating meaning. A certain amount of coordination is necessary in all strategies, but emergent semantics provides tools that enable geodata users to negotiate meaning in a joint process using folksonomies. Folksonomies are collectively managed, hierarchy-free collections of keywords, called tags, which are linked to information sources. For example, a Delicious user can tag her website describing a bike tour around Lake Constance with the tags “Bicycle”, “Excursion”, “Austria”, “Bregenz” and “Lake Constance”. In this case, the user defines a category for her website by linking the term directly with the description of the excursion. In contrast to classic ontologies, the consensus is not enforced from the top down (as from the national level to the village community in the film) but emerges from the bottom up - users agree among themselves.

Semantic referencing

The aim is to separate different contexts clearly from one another. This third goal of semantic strategies cannot be achieved by means of ontologies and folksonomies alone, since formal theories and tags are unable to pin down the desired meaning unambiguously. In particular, the symbols used cannot by themselves refer to their referents in the real world, which is known as the symbol grounding problem. However, geospatial data often have to be interpreted very narrowly in terms of very specific observations: for example, “Garth Hill” refers only to that one Welsh mountain; the border of Germany refers to an agreed dividing line between national territories; the term "1 meter" refers to a number of physical phenomena, including a platinum bar. Analogous to temporal or spatial reference systems, semantic reference systems allow terminology to be anchored in reproducible observations and thus to define a “semantic datum”. Usually, exact locations are specified with the help of a geodetic reference ellipsoid, i.e. a mathematical frame of reference, which is anchored in the perceptible environment by a geodetic datum, i.e. a fixed location on the surface of the earth, a standard orientation and a standard position. Semantic reference systems are in fact generalizations of such reference systems. Examples of such an approach are described in the literature.

Methods of geosemantics

The previous sections deal with the motivation for geosemantics. This section describes established methods of geosemantics in detail. The selection of the methods is based on the fields of application described above.

Methods for the retrieval of geographic information

The search for geodata requires the formulation of search queries with georeferences, the indexing of georeferenced documents and geodata, relevance assessment adapted to geographic information and support for the user in evaluating the search results. All of these steps are supported by methods of geosemantics.

Finding information
A commonly used method is to help users refine and expand queries by suggesting concepts from ontologies or unique place names from gazetteers. The search engine SPIRIT is an example of semantically supported query expansion for geographic information retrieval. A more refined method is based on explicit semantic queries, which are represented in a rule-like format and with a vocabulary derived from ontologies and gazetteers. Queries based on the Semantic Web Rule Language (SWRL) can support spatial reasoning and may provide better results if extensions for spatial functions are integrated.
Indexing
Indexing algorithms for the categorization of georeferenced documents are partly based on external knowledge bases. Traditional search engines rely on indexing text (supported by linguistic processes such as stemming or lemmatization). Data mining techniques such as latent semantic indexing identify key concepts and place names within documents, which are then automatically linked to spatial ontologies via semantic annotation.
Ranking
Based on the concepts of the query and their correspondence with the indexed documents, an information retrieval system can determine the relevance of the document to the query. The spatial distance of the georeferences can represent an additional evaluation criterion.
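The query expansion step described under "Finding information" can be sketched with a toy ontology and a toy gazetteer. The dictionaries, their contents and the function names below are hypothetical and do not reflect how SPIRIT or any other system is implemented.

```python
# Hypothetical ontology fragment: concept -> narrower concepts.
NARROWER = {
    "accommodation": ["hotel", "hostel", "campsite", "botel"],
    "hotel": ["boutique hotel"],
}

# Hypothetical gazetteer fragment: ambiguous name -> unique place names.
PLACES = {
    "Amsterdam": ["Amsterdam, Netherlands", "Amsterdam, New York, USA"],
}


def expand_concept(term):
    """Return the term followed by all of its narrower concepts (depth first)."""
    expanded, stack = [], [term]
    while stack:
        current = stack.pop()
        if current not in expanded:
            expanded.append(current)
            # Reverse so that narrower terms are visited in their listed order.
            stack.extend(reversed(NARROWER.get(current, [])))
    return expanded


def expand_query(concept, place):
    """Suggest concept expansions and unambiguous place names for a query."""
    return {
        "concepts": expand_concept(concept),
        "place_candidates": PLACES.get(place, [place]),
    }


suggestions = expand_query("accommodation", "Amsterdam")
```

For the query "accommodation in Amsterdam", the sketch proposes the narrower accommodation types and asks the user to choose between the candidate places, instead of silently picking one interpretation.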

Methods for semantics-based automatic data integration

In information retrieval, users are supported in finding relevant information and are served the most relevant documents as a result. The user evaluates the results and selects the most suitable hits. Data integration relieves the user of this evaluation step. In this case, the system should automatically recognize the required documents and other data sources and incorporate them into existing work processes. In information retrieval, the semantics help the user to formulate search queries and to understand the results. In data integration, this is the task of the reasoner. Various methods based on WSMX were tested in the SWING project.

Methods for checking the consistency of spatial data

Errors in the data collection process can result in disconnected rivers or trees within lakes. Reasoning algorithms can ensure the logical and topological consistency of data when individual geographical objects are to be linked with concepts within ontologies. Semantic rules can explicitly represent such contradictions. Reasoners execute these rules and identify errors.

Methods for creating geoontologies

The creation of geoontologies is a time-consuming undertaking that can be supported by various methods.

Knowledge acquisition for the construction of knowledge models in geosemantics is similar to that in other disciplines. A special feature of domains such as geology, geography or hydrology is that they are anchored in reality. A distinction is made between top-down and bottom-up strategies; in most cases, however, a combination of the two is used. Integrating an already established vocabulary can support the reusability of the ontologies.

Top-down cooperation with experts
A literature search in the relevant sub-area helps knowledge engineers to define the desired scope of the ontology and to identify key terms of the subject. This is ideally accompanied by data mining techniques. The existence of tacit knowledge, as well as frequently occurring inconsistencies between definitions, requires cooperation with experts in the respective scientific field in order to explain the often complex meaning of the terms clearly.
Bottom-up - knowledge acquisition as a community service
Examples like Wikipedia show that sometimes an information community is itself the best source for building explicit knowledge about a domain. OpenStreetMap, a collaborative attempt to build a map system, asks the editor to categorize individual objects. The category names can be freely chosen by the user, even though the use of existing terms is encouraged. This vocabulary is a valuable source for ontologies.
Identification and reuse of existing models
Semantic search engines like Swoogle already allow existing ontologies to be searched. One example from geography is the SWEET ontology. The reuse of individual concepts from existing and established thesauri such as the General Multilingual Environmental Thesaurus (GEMET) or the AGROVOC thesaurus of the FAO ensures broad acceptance across application boundaries.

Inference methods for geoontologies

Geo-ontologies can be used to draw logical conclusions from modeled facts about the perceptible environment and to derive knowledge from geographic data, or to add to it by means of deductive inference. A frequently used approach from the Semantic Web is to use decidable fragments of first-order predicate logic, such as description logics (DL), in order to reason over taxonomies: for example, to decide whether one class contains another (subsumption), whether two classes share individuals or whether a class is empty, or to find all instances of a class. An important method is the automatic consistency check of a theory in this calculus. Special tableau-based and resolution-based automatic provers have been developed for DL languages such as the Web Ontology Language OWL.
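A few of the reasoning tasks listed above (subsumption, common superclasses, instance retrieval) can be illustrated on an asserted class hierarchy. This is a deliberately simplified sketch: it only follows explicitly stated subclass links and is not a description logic reasoner; the class names and instances are hypothetical.

```python
# Hypothetical asserted taxonomy: class -> direct superclasses.
PARENTS = {
    "motorway": {"road"},
    "road": {"transport_link"},
    "railway": {"transport_link"},
    "transport_link": set(),
}

# Hypothetical individuals and their most specific asserted class.
INSTANCES = {"A1": "motorway", "B7": "road", "Westbahn": "railway"}


def ancestors(cls):
    """All direct and indirect superclasses of a class."""
    seen, stack = set(), list(PARENTS.get(cls, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS.get(current, ()))
    return seen


def is_subsumed_by(sub, sup):
    """Subsumption test: is every instance of `sub` also an instance of `sup`?"""
    return sub == sup or sup in ancestors(sub)


def common_superclasses(a, b):
    """Classes that subsume both a and b."""
    return (ancestors(a) | {a}) & (ancestors(b) | {b})


def instances_of(cls):
    """Instance retrieval: individuals whose class is subsumed by `cls`."""
    return sorted(i for i, c in INSTANCES.items() if is_subsumed_by(c, cls))
```

In this toy taxonomy a motorway is subsumed by "transport_link", and roads and railways share only that superclass. This also hints at why merging both networks for car routing needs the more specific class. Real DL reasoners additionally handle class definitions, restrictions and consistency checks.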

Typical tasks are: satisfiability testing, subsumption, instance checking, finding the least common superclass and the most specific class, as well as similarity-based reasoning. Some tasks, such as rule-based reasoning and querying, e.g. with SWRL, can be solved comparatively efficiently with the help of Horn-clause-based methods, i.e. with forward or backward chaining. Many spatial theories, however, for example the "Region Connection Calculus", mereology or Euclidean geometries, require the full expressiveness of first-order or even second-order predicate logic (see "Region Based Geometry") and therefore require semi-automatic inference procedures such as resolution or natural deduction. In geosemantics, precalculated “composition tables” or conceptual neighborhood graphs are often used for proofs, for example in the 9-intersection model of region topology. A common approach is to extract task-specific and more efficiently computable partial theories from an expressive but undecidable basic theory, e.g. in the form of Horn clauses or description logics.
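Forward chaining over Horn-style rules can be sketched as a fixed-point computation: rules are applied to the known facts until nothing new can be derived. The two rules below (transitivity of `part_of` and inheritance of `located_in` along `part_of`) and the facts are assumptions for illustration; they are written as plain Python functions rather than in SWRL.

```python
def transitive_part_of(facts):
    """part_of(a, b) and part_of(b, c) imply part_of(a, c)."""
    derived = set()
    for pred1, a, b in facts:
        for pred2, c, d in facts:
            if pred1 == pred2 == "part_of" and b == c:
                derived.add(("part_of", a, d))
    return derived


def located_in_whole(facts):
    """located_in(x, a) and part_of(a, b) imply located_in(x, b)."""
    derived = set()
    for pred1, x, a in facts:
        for pred2, c, b in facts:
            if pred1 == "located_in" and pred2 == "part_of" and a == c:
                derived.add(("located_in", x, b))
    return derived


def forward_chain(facts, rules):
    """Apply all rules repeatedly until no new facts are derived."""
    known = set(facts)
    while True:
        derived = set()
        for rule in rules:
            derived |= rule(known)
        if derived <= known:
            return known
        known |= derived


# Hypothetical facts as (predicate, subject, object) triples.
FACTS = {
    ("part_of", "old_town", "vienna"),
    ("part_of", "vienna", "austria"),
    ("located_in", "pub_1", "old_town"),
}

closure = forward_chain(FACTS, [transitive_part_of, located_in_whole])
```

The closure contains the derived fact that the pub is located in Austria, although this was never stated. The loop terminates because only finitely many triples can be built from the given names.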

Tools for geosemantics

Tools and standards in the field of geosemantics are largely the same as those used for the Semantic Web. These include W3C standards such as XML, XML Schema, RDF, RDF Schema, OWL or the Semantic Web Rule Language (SWRL). Ontologies based on these standards enable the specification of the concepts and relations that are required for the techniques mentioned.

  • Ontology editors enable information scientists to specify domain knowledge in the standards mentioned. The best-known example for manual editing is Protégé.
  • Tools specially designed for spatial ontologies are ConceptVista and Rabbit, a language for editing ontologies that was developed by Ordnance Survey. Most editors only support a limited number of standards for knowledge representation. Since the standards differ in their expressiveness, the choice of an editor depends not least on how much expressiveness is needed.
  • In the first phase of ontology modeling, on the other hand, less specialized but easier-to-use editors can be used, which are also mostly independent of ontology standards. Concept maps facilitate access to knowledge models and collaborative work on an ontology; an example of this is CMapTools (cmap.ihmc.us).
  • Semi-automatic tools such as Reuters Calais or OntoGen (ontogen.ijs.si) support the detection of core concepts by means of data mining. In the first development phase, they are used to determine the scope and focus of the knowledge area.
  • In order to check the consistency of existing data, rules are used that can be expressed in SWRL, for example. In developments such as GeoSWRL (semwebcentral.org), spatial relations are integrated into SWRL.
  • Even in combination with comprehensive sets of rules, ontologies alone cannot implement semantic applications. Reasoners such as Pellet (clarkparsia.com) or Jena are necessary; they enable machine inference and thus make applications of geosemantics feasible. SIM-DL (sim-dl.sourceforge.net) is a reasoner specially designed for geosemantics, which is based on description logic and supports similarity measurement.

History of geosemantics

The study of geosemantics is based on results from many disciplines. This brief outline of the history of geosemantics focuses on contributions from the last three decades that deal specifically with thematic aspects of spatial data. The history can be divided into four phases, each characterized by a particular inference method and form of semantic representation.

From digital maps to spatial reasoning (until 1990)

The transition from paper-based to digital geospatial data in the 1960s initially led to the assumption that geographic information systems were primarily tools for storing and editing digital maps. It took two decades, shaped by advances in database research, until the difference between the graphic symbols on a map and the spatial and thematic information they represent became clear. At the same time, scientists in artificial intelligence and computer science began to work on inference methods and on problems of representing spatial data. Representatives of these two groups, together with geographers, linguists, philosophers and cognitive scientists, came together in July 1990 at the influential two-week "Las Navas Meeting" to discuss cognitive and linguistic aspects of geographical space and to publish the state of the art in a groundbreaking book.

Geographical representation and inference methods (from around 1990 to 2000)

The study of the meaning of georeferenced data and its use in inference methods now became a central topic in geoinformatics, cognitive science and artificial intelligence. The US National Science Foundation (NSF) funded the "National Center for Geographic Information and Analysis" (NCGIA) as a consortium consisting of the University of California, Santa Barbara, the State University of New York in Buffalo and the University of Maine. This acknowledged that geographic space poses particular challenges in terms of representation and analysis. NCGIA's research program was shaped by five core topics with a strong reference to semantics. During this time, the COSIT (Conference on Spatial Information Theory) and GIScience (Conference on Geographic Information Science) conference series were founded. In addition to the modeling of cognitive and linguistic aspects, cultural aspects and differences now also became a focus of interest.

Distribution of geospatial data on the World Wide Web (from around 1995)

In the meantime, industry and public authorities had recognized that attempts to standardize the vocabulary (as in the Official Topographic-Cartographic Information System) limit the geodata market to users with similar conceptualizations unless they are accompanied by semantic translation. At the same time, the emerging mass market for navigation systems confronted the general public with semantic problems. The Open GIS Consortium (OGC, today the Open Geospatial Consortium) was founded under the banner of interoperability. Providers of existing systems were not expected to revise their underlying data models, but to implement open interfaces that enable communication across technical and semantic boundaries. The OGC coined the term information communities and responded early by founding a working group on semantics. Nevertheless, it still focuses on syntactic interoperability and on connecting spatial data with general information technology, and leaves semantic questions to the users. In close cooperation with the OGC, however, an expert meeting and two conferences on interoperability were organized at an early stage. In addition, research began to focus on issues relating to the publication and use of geosemantics in service-oriented architectures.

Dissemination of geographic information on the interactive web (from around 2005)

The emergence and widespread acceptance of interactive forms of communication in the first years of the new millennium gave rise to the social web known as Web 2.0. At this new level of the Internet, geographic information is not only consumed by users but also actively contributed by them (e.g. through GPS data and photos). This changes geoinformatics as a whole, and geosemantics with it. The traditional hierarchical approach to spatial data production, with its hierarchically controlled semantics, is now being supplemented in many application areas by a broad base of user-contributed data. The impressive growth of OpenStreetMap illustrates this development, but also its semantic challenges. The greatest potential gain of such crowdsourced information for geosemantics lies in the analysis of different interpretations of terms in the form of tags. Such empirical data are highly valuable for geosemantics, and the research questions they raise will be a focus of geosemantics research in the coming years.

Open research questions

In a certain sense, the whole of geosemantics is still an open field of research. As can be seen from the above strategies, methods and tools, however, there is a steadily growing foundation on which current and future research questions are based. Some of them are briefly outlined below to provide an insight into the current state of research.

Semantics of processes

Geosemantics deals with phenomena in the environment. In most cases, however, these are not static but are processes or events. For example, understanding measurement results from sensors requires understanding the processes through which these results are obtained. These include the processes that convert an observation in the real world into an electronic signal as well as the processes that gave rise to the observed phenomenon. Understanding an anemometer, for instance, requires understanding the pressure-dependent circulation of air masses. This applies even more to complex processes like climate change. To assess whether the climate is changing significantly, one must understand the underlying processes and their effects. While research currently makes do with static models, geosemantics will increasingly deal with the formalization of such processes and their observability.

Vagueness

The use of terms in the geosciences is often vague and, in contrast to disciplines such as bioinformatics, it is difficult to agree on canonical definitions. For example, one can say that a healthy human hand has five fingers; a comparably clear definition of a river is possible only to a limited extent. The problem is even more complex. On the one hand, terms cannot be defined without a context or domain. On the other hand, the formal methods used, such as ontologies, only restrict the possible interpretations without being able to exclude undesired ones completely. Future research questions can be divided into two areas: methods must increasingly be developed that allow the effective use of vague terms, and geosemantics must face the challenge of semantic translation.

Semantic translation

Different conceptions of the concept of the road

The core idea of semantic translation (or of a semi-automatic semantic translator) is not to achieve an all-encompassing agreement on the definition of terms, but to allow heterogeneity. There are good reasons why individual domains within the geosciences hold different views of the same things in the world. The road example described in the introduction illustrates this well. Roads can be seen both as connections between places and as the opposite, namely as obstacles for animals that cut up a habitat. The two views are incompatible, yet each makes sense for its own use cases. With the help of semantic translation, semantic alignment methods and similarity measures, however, bridges between these worldviews can be built. These bridges are not only used to exchange data, but also help experts to find consensus.

Trustworthiness

In the vision of the Semantic Web, the problem of the trustworthiness of information and inferences forms the top layer of the Semantic Web Layer Cake. The terms trust and trustworthiness admit many interpretations. In geosemantics, there is currently a tendency to define trustworthiness as a measure of information quality (in the sense of the usefulness of data for a specific task). This goes far beyond classic approaches to data quality, which usually deal with consistency or completeness. The underlying idea is that users who have proven trustworthy will continue to contribute higher-quality results to an information community or a project such as OpenStreetMap than users whose previous contributions have turned out to be imprecise.

Evolution of semantics

Conceptualizations and the use of language change over time. This is not yet captured by today's ontologies and not yet exploited by folksonomies. Many applications in the field of geographic information must be able to deal with changing semantics, for example with changing place names or revised classifications. Temporal indexing of ontologies would be the simplest approach, but is typically neither performed nor used for inference. Further approaches to modeling semantic change are largely out of reach with the currently available techniques.
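The idea of temporal indexing can be sketched for the case of changing place names. The data structure, the identifier place:st_petersburg and the function names below are illustrative assumptions, and validity is simplified to whole years; a real gazetteer or ontology would need finer-grained and better-sourced intervals.

```python
from bisect import bisect_right

# For each place: a list of (first year of validity, official name),
# sorted by year. Year granularity is a simplification.
NAME_HISTORY = {
    "place:st_petersburg": [
        (1703, "Saint Petersburg"),
        (1914, "Petrograd"),
        (1924, "Leningrad"),
        (1991, "Saint Petersburg"),
    ],
}


def name_at(place_id, year):
    """Return the name valid in the given year, or None before the first entry."""
    history = NAME_HISTORY[place_id]
    start_years = [start for start, _ in history]
    index = bisect_right(start_years, year) - 1
    return history[index][1] if index >= 0 else None


def places_named(name, year):
    """Resolve a name to place identifiers, relative to a point in time."""
    return [pid for pid in NAME_HISTORY if name_at(pid, year) == name]
```

With such an index, a query for "Leningrad" resolves to the same identifier as a query for "Saint Petersburg", provided that each query carries the year it refers to. Using the index for inference, as opposed to lookup, is the step the text describes as typically missing.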

Further reading

  • Al Gore: The Digital Earth: Understanding our planet in the 21st Century. (isde5.org, memento from June 20, 2009 in the Internet Archive)
  • F. Fonseca, M. A. Rodriguez, S. Levashkin (Eds.): GeoSpatial Semantics, Second International Conference, GeoS 2007, Mexico City, Mexico, November 29-30, 2007. (= Lecture Notes in Computer Science. 4853). 2007, ISBN 978-3-540-76875-3.
  • K. Janowicz, C. Keßler, M. Schwarz, M. Wilkes, I. Panov, M. Espeter, B. Bäumer: Algorithm, Implementation and Application of the SIM-DL Similarity Server. In: Second International Conference on GeoSpatial Semantics (GeoS 2007). Mexico City, Mexico, November 29-30, 2007 (= Lecture Notes in Computer Science). Springer 2007, pp. 128-145. (ifgi.uni-muenster.de, PDF; 766 kB)
  • K. Janowicz, S. Scheider, T. Pehle, G. Hart: Geospatial Semantics and Linked Spatiotemporal Data - Past, Present, and Future. In: Semantic Web Journal. 2012. (geog.ucsb.edu, PDF; 274 kB)
  • M. Kavouras, M. Kokla: Theories of Geographic Concepts: Ontological Approaches to Semantic Integration. 1st edition. CRC Press, 2007, ISBN 978-0-8493-3089-6.
  • W. Kuhn: Semantic Reference Systems. In: International Journal of Geographical Information Science, Guest Editorial. 17 (5), 2003, pp. 405-409. (tandfonline.com)
  • W. Kuhn: Geospatial Semantics: Why, of What, and How? In: Journal on Data Semantics III. (= Lecture Notes in Computer Science. 3534). Springer, Berlin / Heidelberg 2005, ISBN 3-540-26225-3. (researchgate.net)
  • P. Maué, S. Schade, P. Duchesne: Semantic Annotations in OGC Standards. Open Geospatial Consortium (OGC), July 2009. (portal.opengeospatial.org)
  • H. Pundt: The Semantic Mismatch as Limiting Factor for the Use of Geospatial Information in Disaster Management and Emergency Response. In: S. Zlatanova, J. Li (Eds.): Geospatial Information Technology for Emergency Response. ISPRS Book Series, Taylor & Francis, London / New York 2008, pp. 243-256.
  • M. A. Rodriguez, I. F. Cruz, M. J. Egenhofer, S. Levashkin (Eds.): GeoSpatial Semantics, First International Conference, GeoS 2005, Mexico City, Mexico, November 29-30, 2005, Proceedings. (= Lecture Notes in Computer Science. 3799). 2005, ISBN 3-540-30288-3.
  • S. Shekhar, H. Xiong (Eds.): Geospatial Ontology, Geospatial Semantic Interoperability. In: Encyclopedia of GIS. Springer, 2008, ISBN 978-0-387-30858-6.

Notes

a. While most gazetteers can handle such simple queries, they fail on those that require more information about feature types. A simple example is a request for accommodation in Vienna. To answer it, a gazetteer must recognize that the search term generalizes several feature types, and which ones (e.g. hotels, motels, youth hostels). Similarly, gazetteers cannot handle data of different origins and quality, especially user-generated data (volunteered geographic information), which in addition to official place names often also uses local names (e.g. city center) and small-scale feature types (e.g. pubs).