Resource Description Framework

from Wikipedia, the free encyclopedia

The Resource Description Framework (RDF, roughly "system for the description of resources") is a technical approach on the Internet for formulating logical statements about arbitrary things (resources). RDF was originally designed by the World Wide Web Consortium (W3C) as a standard for describing metadata. It is now considered a fundamental building block of the Semantic Web. RDF resembles classic conceptual modeling methods such as UML class diagrams and the entity-relationship model. In the RDF model, every statement consists of three units: subject, predicate, and object. A resource, the subject, is described in more detail by another resource or a value (literal), the object. Together with a further resource as the predicate, these three units form a triple (a 3-tuple). To give resources globally unique identifiers, the identifiers are formed by convention analogously to URLs. The URLs for commonly used descriptions, for example for metadata, are known to RDF developers and can be used worldwide for the same purpose, which among other things enables programs to present the data in a way that is meaningful to humans.

RDF model

The RDF model is a data model with well-defined formal semantics, based on directed graphs. Data in RDF are statements about resources, and these statements are modeled as triples. The set of triples forms a (mathematical) graph, known as the RDF model. Each triple is a statement consisting of subject, predicate, and object.

Example

Basic RDF diagram

A triple represents a statement that relates a subject to an object. The relation is directed from the subject to the object and is named by the predicate. Triples that refer to the same subjects or objects form a semantic network, which is often represented as a table or a graph. Put simply, every statement in RDF is a simple sentence, for example:

"ACME produces batteries"

Modeled in RDF, this becomes:

  • Subject = ACME
  • Predicate = produces
  • Object = batteries

In the following example table (extended with further statements), each row forms a triple:

subject    predicate  object
ACME       produces   batteries
batteries  contain    acid
batteries  contain    zinc
ACME       is a       company
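
The table can be sketched in code as a set of 3-tuples. This is only an illustration under simplifying assumptions: plain strings stand in for real URIs, and the helper names objects and describe are hypothetical.

```python
# Each statement is a (subject, predicate, object) tuple. Plain strings stand
# in for real URIs here to keep the sketch short.
triples = {
    ("ACME", "produces", "batteries"),
    ("batteries", "contain", "acid"),
    ("batteries", "contain", "zinc"),
    ("ACME", "is a", "company"),
}


def objects(graph, subject, predicate):
    """Return all objects of statements with the given subject and predicate."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)


def describe(graph, subject):
    """Return every (predicate, object) pair stated about a subject."""
    return sorted((p, o) for s, p, o in graph if s == subject)


print(objects(triples, "batteries", "contain"))  # ['acid', 'zinc']
print(describe(triples, "ACME"))
```

Because the graph is a set, stating the same triple twice does not change it.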

Resource, URI, and Literal

A resource is something that is uniquely identified and about which one wants to make a statement. Subject and predicate are always resources. The object can be either a resource or a literal. Literals are character strings that may additionally be interpreted according to a specified data type; they can represent, for example, truth values, numbers, or dates. RDF resources are identified by unique identifiers (URIs), which allow statements from different sources to be linked. A resource is usually identified by a URI that resembles a URL in form. URLs are a special kind of URI that also states where a resource, such as a web page, can be retrieved. URIs do not necessarily have to be reachable on the network.

Examples:

  • URI of the website for this article: http://de.wikipedia.org/wiki/Resource_Description_Framework
  • URI of a mail address: mailto:123@example.com
  • URI of a book: urn:isbn:978-3898530194
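
A short Python sketch can make the difference between these identifiers visible. It uses the standard library function urllib.parse.urlparse; the helper names scheme_of and has_network_location are hypothetical and chosen only for this illustration.

```python
from urllib.parse import urlparse

uris = [
    "http://de.wikipedia.org/wiki/Resource_Description_Framework",
    "mailto:123@example.com",
    "urn:isbn:978-3898530194",
]


def scheme_of(uri):
    """Return the part before the first colon, e.g. 'http' or 'urn'."""
    return urlparse(uri).scheme


def has_network_location(uri):
    """True if the URI names a host, as an http URL does."""
    return bool(urlparse(uri).netloc)


for uri in uris:
    print(scheme_of(uri), has_network_location(uri), uri)
```

All three strings are valid identifiers for RDF purposes, but only the first names a host from which something could be fetched.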

Statements can in turn be made in RDF about the resources used as predicates, and the result can be published as a metadata vocabulary. Other RDF authors can use such vocabularies by referencing them. A prominent example is the representation of Dublin Core in RDF. Conversely, RDF statements are themselves resources that further statements can refer to. This technique of making statements about statements is called reification.
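
Reification can be sketched as follows. The terms rdf:Statement, rdf:subject, rdf:predicate and rdf:object come from the RDF vocabulary; everything under http://example.org/ and the function name reify are assumptions made up for this illustration.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX = "http://example.org/"  # hypothetical namespace for this sketch


def reify(statement_id, triple):
    """Describe a triple with four triples about a statement resource."""
    s, p, o = triple
    return [
        (statement_id, RDF + "type", RDF + "Statement"),
        (statement_id, RDF + "subject", s),
        (statement_id, RDF + "predicate", p),
        (statement_id, RDF + "object", o),
    ]


original = (EX + "ACME", EX + "produces", EX + "batteries")
statement = EX + "statement1"
graph = reify(statement, original)

# A statement about the statement: who made the claim (hypothetical property).
graph.append((statement, EX + "claimedBy", EX + "pressRelease"))

for triple in graph:
    print(triple)
```

Note that the reified description does not by itself assert the original triple; it only talks about it.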

In addition, RDF has predefined types for lists and sets, which are used to group resources. Resources that have no explicit URI of their own and serve only to group other objects are usually modeled as so-called "blank nodes". An example is a name that is made up of separate strings for the first and last name.
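
The name example might look like this in code. This is a sketch under stated assumptions: the property URIs under http://example.org/ and the person "Jane Doe" are invented, and the "_:" prefix follows the common convention for writing blank node labels.

```python
import itertools

EX = "http://example.org/"  # hypothetical namespace for this sketch
_counter = itertools.count(1)


def new_blank_node():
    """Return a fresh label; it is only meaningful inside this graph."""
    return f"_:b{next(_counter)}"


def name_triples(person, first, last):
    """Group first and last name under one blank node attached to a person."""
    node = new_blank_node()
    return [
        (person, EX + "name", node),
        (node, EX + "firstName", first),
        (node, EX + "lastName", last),
    ]


for triple in name_triples(EX + "jane", "Jane", "Doe"):
    print(triple)
```

The blank node has no global identifier, so other graphs cannot refer to it directly; it exists only to hold the two name parts together.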

Representation

RDF is independent of any particular (textual) representation. The most common representations are XML and a more compact syntax called Notation3 (N3). In 2011 the W3C also defined the language Turtle, a reduced subset of N3 that is intended to promote wider adoption.

There are various approaches (triplestores) for storing RDF in databases and data structures, since simply storing the triples in a single relational table is not very efficient for many queries.
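
One common idea behind such stores is to keep several indexes over the same triples, so that a lookup does not have to scan everything. The following in-memory sketch illustrates only that idea; the class TripleStore and its methods are hypothetical and do not describe any particular product.

```python
from collections import defaultdict


class TripleStore:
    """Toy store with one index per triple position."""

    def __init__(self):
        self._triples = set()
        self._by_subject = defaultdict(set)
        self._by_predicate = defaultdict(set)
        self._by_object = defaultdict(set)

    def add(self, s, p, o):
        triple = (s, p, o)
        self._triples.add(triple)
        self._by_subject[s].add(triple)
        self._by_predicate[p].add(triple)
        self._by_object[o].add(triple)

    def match(self, s=None, p=None, o=None):
        """Return triples matching the pattern; None acts as a wildcard."""
        candidate_sets = []
        if s is not None:
            candidate_sets.append(self._by_subject.get(s, set()))
        if p is not None:
            candidate_sets.append(self._by_predicate.get(p, set()))
        if o is not None:
            candidate_sets.append(self._by_object.get(o, set()))
        if not candidate_sets:
            return set(self._triples)
        # Only triples present in every selected index satisfy the pattern.
        return set.intersection(*candidate_sets)


store = TripleStore()
store.add("ACME", "produces", "batteries")
store.add("batteries", "contain", "acid")
store.add("batteries", "contain", "zinc")
store.add("ACME", "is a", "company")

print(sorted(store.match(p="contain")))
```

The trade-off is memory: every triple is referenced from three indexes in exchange for faster pattern lookups.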

RDF triples are also represented graphically. By convention, resources that are the subject or object of a triple are drawn as ellipses and literals as rectangles. The connection between a subject and an object is drawn as a directed edge labeled with the predicate. The following figure follows this convention and describes the resource "http://de.wikipedia.org/wiki/Resource_Description_Framework". The RDF graph states that the resource, in this case this article, has the title "Resource Description Framework" and the publisher "Wikipedia". In the example, the publisher is modeled only as a literal and therefore cannot be described any further.

Example of graph
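
The drawing convention can be sketched by generating Graphviz DOT text, in which shape=ellipse and shape=box are standard node attributes. The Literal marker class and the function to_dot are assumptions introduced for this illustration, and the sketch assumes that no value contains a double quote.

```python
DC = "http://purl.org/dc/elements/1.1/"
ARTICLE = "http://de.wikipedia.org/wiki/Resource_Description_Framework"


class Literal(str):
    """Marks an object as a literal value rather than a resource."""


def to_dot(triples):
    """Render triples as DOT: resources as ellipses, literals as boxes."""
    lines = ["digraph rdf {"]
    for s, p, o in triples:
        lines.append(f'  "{s}" [shape=ellipse];')
        shape = "box" if isinstance(o, Literal) else "ellipse"
        lines.append(f'  "{o}" [shape={shape}];')
        # The edge points from subject to object and carries the predicate.
        lines.append(f'  "{s}" -> "{o}" [label="{p}"];')
    lines.append("}")
    return "\n".join(lines)


graph = [
    (ARTICLE, DC + "title", Literal("Resource Description Framework")),
    (ARTICLE, DC + "publisher", Literal("Wikipedia")),
]
print(to_dot(graph))
```

The resulting text can be passed to a Graphviz renderer to produce a picture of the graph.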

Query languages

Various query languages have been designed for searching RDF data. The RDF Query Language (RDQL) closely resembles SQL in form. In January 2008, the W3C published SPARQL as a W3C Recommendation, making it the standard RDF query language; as a result, many implementations of SPARQL exist.

The following description of the present article serves as an example, with title and publisher defined according to Dublin Core: 'http://de.wikipedia.org/wiki/Resource_Description_Framework' has the title 'Resource Description Framework' and the publisher 'Wikipedia – Die freie Enzyklopädie' ("Wikipedia, the free encyclopedia"). In RDF (N3) this is expressed with two triples (the keyword has serves only to improve readability):

<http://de.wikipedia.org/wiki/Resource_Description_Framework> has <http://purl.org/dc/elements/1.1/title> "Resource Description Framework" .
<http://de.wikipedia.org/wiki/Resource_Description_Framework> has <http://purl.org/dc/elements/1.1/publisher> "Wikipedia – Die freie Enzyklopädie" .
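
The same two statements can also be written one triple per line without the has keyword. The Python sketch below produces such lines in the style of N-Triples; the function name to_ntriples_line is hypothetical, and the sketch handles only plain string literals as objects.

```python
DC = "http://purl.org/dc/elements/1.1/"
ARTICLE = "http://de.wikipedia.org/wiki/Resource_Description_Framework"


def to_ntriples_line(subject, predicate, literal):
    """Format one triple with a plain literal object as a single line."""
    # Escape the characters that would otherwise end or break the literal.
    escaped = (
        literal.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'<{subject}> <{predicate}> "{escaped}" .'


print(to_ntriples_line(ARTICLE, DC + "title", "Resource Description Framework"))
print(to_ntriples_line(ARTICLE, DC + "publisher", "Wikipedia – Die freie Enzyklopädie"))
```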

A query that retrieves the title of a resource whose publisher is “Wikipedia – Die freie Enzyklopädie” could look like this in SPARQL:

PREFIX dc: <http://purl.org/dc/elements/1.1/>

# Return the title of every resource that has the given publisher.
SELECT ?title
WHERE {
  ?res dc:publisher ?pub .
  ?res dc:title ?title .
  # sameTerm requires the literal to match exactly, character for character.
  FILTER ( sameTerm(?pub, "Wikipedia – Die freie Enzyklopädie") )
}

The result is a table with exactly one entry (a binding of the variable ?title) with the value "Resource Description Framework".
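
What such a query does can be approximated in plain Python without any RDF library. The sketch below is a simplification, not a SPARQL implementation: it joins the two patterns on the shared resource and compares the publisher literal for exact equality. The function name titles_by_publisher is hypothetical.

```python
DC = "http://purl.org/dc/elements/1.1/"
ARTICLE = "http://de.wikipedia.org/wiki/Resource_Description_Framework"
PUBLISHER = "Wikipedia – Die freie Enzyklopädie"

triples = [
    (ARTICLE, DC + "title", "Resource Description Framework"),
    (ARTICLE, DC + "publisher", PUBLISHER),
]


def titles_by_publisher(graph, publisher):
    """Return the titles of all resources with exactly this publisher."""
    # Pattern 1: find every resource whose dc:publisher equals the literal.
    resources = {s for s, p, o in graph if p == DC + "publisher" and o == publisher}
    # Pattern 2: collect dc:title values, joined on the same resource.
    return [o for s, p, o in graph if p == DC + "title" and s in resources]


print(titles_by_publisher(triples, PUBLISHER))  # ['Resource Description Framework']
```

Because the comparison is exact, a publisher string that differs in a single character, such as a hyphen in place of the dash, yields no result.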

History

The Meta Content Framework (MCF), a language developed by Ramanathan V. Guha between 1995 and 1997 and, after his move to Netscape, submitted to the W3C in an XML-based form in June 1997, can be regarded as a forerunner of RDF. In the context of the browser wars, MCF was also a response to Microsoft's Channel Definition Format. Instead of giving preference to MCF, the W3C decided to develop a general language for formulating metadata, which was to be called RDF. The first RDF standard was presented as a draft in August 1997 and published as a Recommendation in February 1999. Development of RDF Schema began in 1999.
