XPath

from Wikipedia, the free encyclopedia

The XML Path Language (XPath) is a query language developed by the World Wide Web Consortium (W3C) to address and evaluate parts of an XML document. XPath serves as the basis for a number of other standards such as XSLT, XPointer and XQuery. The current standard is XPath 3.1, published on March 21, 2017.

Web browsers, XSLT processors and other software often support only XPath 1.0 from 1999, and in some cases XPath 2.0 from 2007.

Principles

An XPath expression addresses parts of an XML document, which is viewed as a tree, although some differences from the "classic" tree of graph theory must be noted:

  • The nodes of the tree represent the document node, XML elements, attributes, text nodes, comments, namespaces and processing instructions.
  • The axes preceding, following, preceding-sibling and following-sibling are based not only on the tree but also on the order in which the elements are declared in the XML document (linked tree).

An XPath expression is composed of one or more location steps, which are separated by the character /.

A location step axis::node-test[predicate 1][predicate 2]... consists of:

  • an axis,
  • a node test,
  • optionally followed by one or more predicates.
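The structure of a single location step can be illustrated with a small Python sketch. The function name split_step and the regular expression are hypothetical helpers, not part of any XPath library, and the sketch only handles the unabbreviated form with flat predicates (no nested brackets, no abbreviations such as @ or ..).

```python
import re

# Hypothetical pattern for one location step: axis::node-test[pred][pred]...
STEP = re.compile(
    r"^(?:(?P<axis>[a-z-]+)::)?"       # optional axis name followed by '::'
    r"(?P<test>[^\[\]]+)"              # node test, up to the first '['
    r"(?P<preds>(?:\[[^\[\]]*\])*)$"   # zero or more flat [predicate] groups
)

def split_step(step):
    """Split one location step into (axis, node test, list of predicates)."""
    match = STEP.match(step)
    if match is None:
        raise ValueError(f"unsupported location step: {step!r}")
    axis = match.group("axis") or "child"  # child is the axis used when none is written
    predicates = re.findall(r"\[([^\[\]]*)\]", match.group("preds"))
    return axis, match.group("test"), predicates

print(split_step("child::pa[@format='bold'][2]"))
# ('child', 'pa', ["@format='bold'", '2'])
print(split_step("descendant-or-self::Foo"))
# ('descendant-or-self', 'Foo', [])
```

A real XPath parser needs a proper tokenizer, because predicates may themselves contain brackets, slashes and string literals.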

Any number of XPath expressions can be combined into a union with the pipe symbol |.
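The effect of a union can be sketched in Python with the standard library module xml.etree.ElementTree. That module implements only a subset of XPath, and its documented path syntax does not include the | operator, so the hypothetical helper union below emulates it: it merges the results of several paths, drops duplicates and returns the nodes in document order. The sample document is an assumption made for illustration.

```python
import xml.etree.ElementTree as ET

def union(root, *paths):
    """Emulate path1 | path2 | ...: merge, de-duplicate, keep document order."""
    selected = {node for path in paths for node in root.findall(path)}
    # root.iter() walks the tree in document order
    return [node for node in root.iter() if node in selected]

doc = ET.fromstring("<r><a/><b/><a/><c/></r>")
print([node.tag for node in union(doc, "b", "a", ".//a")])  # ['a', 'b', 'a']
```

Although both "a" and ".//a" match the same two elements, each element appears only once in the result.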

There are always several ways to express a desired set of nodes in XPath.

XPath operates on the logical document structure. This means, for example, that entities have already been resolved, and that any default attributes and nodes prescribed by a schema are already present in the tree.

Axes

Axes are used to navigate the tree structure of the XML document, starting from the current context node.

To start from the document node (the root of the XML document) instead, the XPath expression is prefixed with the character /.

Axis | Addressed nodes | Abbreviation | Position in the tree shown below (relative to element D)
- | the document node | / (at the beginning of an XPath) | Document node
child | all directly subordinate child nodes | (is omitted) | E, G
parent | the directly superordinate parent node | .. | B
self | the context node itself (useful for additional conditions) | . | D
ancestor | all superordinate nodes | - | B, A
ancestor-or-self | all superordinate nodes including the context node | - | D, B, A
descendant | all subordinate nodes | .// | E, F, G
- | all nodes of the document except the document node | // | A, B, C, D, E, F, G, H, I, J, K, L
descendant-or-self | all subordinate nodes including the context node | - | D, E, F, G
following | nodes that follow in the XML document (excluding the subordinate nodes of the context node) | - | H, I, J, K, L
following-sibling | like following, but only nodes with the same parent | - | H, I
preceding | nodes that precede in the XML document (excluding superordinate nodes) | - | C
preceding-sibling | like preceding, but only nodes with the same parent | - | C
attribute | attribute nodes | @ | -
namespace | namespace nodes derived from the xmlns attribute | - | -

This tree exemplifies the structure of an XML document. Its levels, from top to bottom (the context node D is shown in parentheses):

Document node
A
B, L
C, (D), H, I
E, G, J, K
F

Starting from any node, the five axes self, ancestor, descendant, preceding and following cover the document tree completely and without overlap.
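This partition can be checked on the example tree with a short Python sketch based on xml.etree.ElementTree. The helper functions are hypothetical and written for illustration. The parent of B and L, and the children of D, follow from the axis table; attaching F to E, J to H and K to I is an assumption about the figure. ElementTree has no object for the document node, so only the elements A to L are considered.

```python
import xml.etree.ElementTree as ET

# Example tree; the placement of F, J and K is an assumption (see lead-in)
TREE = "<A><B><C/><D><E><F/></E><G/></D><H><J/></H><I><K/></I></B><L/></A>"

root = ET.fromstring(TREE)
order = list(root.iter())                             # all elements in document order
parent = {child: p for p in order for child in p}     # child -> parent map

def ancestors(node):
    found = []
    while node in parent:
        node = parent[node]
        found.append(node)
    return found

def descendants(node):
    return list(node.iter())[1:]                      # iter() yields the node itself first

def following(node):
    skip = set(descendants(node))
    return [n for n in order[order.index(node) + 1:] if n not in skip]

def preceding(node):
    skip = set(ancestors(node))
    return [n for n in order[:order.index(node)] if n not in skip]

def names(nodes):
    return [n.tag for n in nodes]

d = root.find(".//D")
print(names(ancestors(d)))     # ['B', 'A']
print(names(descendants(d)))   # ['E', 'F', 'G']
print(names(preceding(d)))     # ['C']
print(names(following(d)))     # ['H', 'J', 'I', 'K', 'L']

# The five axes together contain every element exactly once
parts = [d] + ancestors(d) + descendants(d) + preceding(d) + following(d)
print(len(parts) == len(set(parts)) == len(order))    # True
```

The following axis is printed in document order, which is why J appears before I under the assumed placement of J below H.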

Node tests

Node tests (written axis::node-test) restrict the element selection of an axis:

  • Specifying an element name selects all elements with that name.
    Example: /descendant-or-self::Foo selects all elements in the document that have the name "Foo".
  • The character * selects any element.
    Example: /descendant-or-self::Foo/child::* selects all elements in the document that are children of elements with the name "Foo".
  • With text(), comment() and processing-instruction(), nodes of a certain type can be selected.
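Name tests and the * wildcard can be tried out with xml.etree.ElementTree from the Python standard library, shown here as a minimal sketch. The sample document is an assumption made for illustration. ElementTree supports only a subset of XPath, evaluates paths relative to an element, and uses the abbreviated syntax, so /descendant-or-self::Foo is approximated by .//Foo. It has no text() node test; text content is read from the text attribute instead.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<root><Foo><a/><b>text</b></Foo><bar><Foo><c/></Foo></bar></root>"
)

# Name test: all Foo elements below the root element
foos = doc.findall(".//Foo")
print(len(foos))                        # 2

# Wildcard: all element children of any Foo element
children = doc.findall(".//Foo/*")
print([node.tag for node in children])  # ['a', 'b', 'c']

# No text() node test in ElementTree; the text is an attribute of the element
print(doc.find(".//b").text)            # text
```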

Predicates

The result can be further restricted by specifying predicates. Predicates are enclosed in square brackets, and any number of them can be written one after the other; their order matters. Predicates can contain XPath expressions, and a variety of functions and operators can be used, for example:

  • Index access (counting starts at 1)
  • Comparison and logical operators: = != and or < > <= >=
  • String functions:
    • normalize-space() - Removes whitespace at the beginning and end of the string and reduces consecutive whitespace to a single space
    • substring() - Selects a substring
    • substring-before(source, splitter) - Selects the substring before the first occurrence of the separator
    • substring-after(source, splitter) - Selects the substring after the first occurrence of the separator
    • string-length() - Length of the string
  • Numerical operators: + - * div mod
  • Node set functions:
    • count() - Number of nodes in a node set
    • id() - Selects elements via the DTD ID
    • name() - Name of the node
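The behaviour of the string functions listed above can be approximated in plain Python. The following functions are hypothetical re-implementations for illustration, not an XPath library. The substring sketch assumes integer arguments and ignores the rounding and NaN rules of the XPath specification.

```python
import re

def normalize_space(s):
    # XML whitespace is space, tab, carriage return and line feed
    return re.sub(r"[ \t\r\n]+", " ", s).strip(" ")

def substring(s, start, length=None):
    # Positions are 1-based, as in XPath; integer arguments assumed
    begin = start - 1
    end = len(s) if length is None else begin + length
    return s[max(begin, 0):max(end, 0)]

def substring_before(source, splitter):
    index = source.find(splitter)
    return source[:index] if index >= 0 else ""   # empty string if not found

def substring_after(source, splitter):
    index = source.find(splitter)
    return source[index + len(splitter):] if index >= 0 else ""

print(normalize_space("  a \t b\n"))          # a b
print(substring("12345", 2, 3))               # 234
print(substring_before("1999/04/01", "/"))    # 1999
print(substring_after("1999/04/01", "/"))     # 04/01
print(len("Kapitel"))                         # 7, corresponds to string-length()
```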

Examples:

  • //child::Buch/Kapitel: all chapters (Kapitel) of all books (Buch).
  • //child::Buch/Kapitel[1]: the first chapter of each book.
  • //child::Buch[count(./Seite)<=100][count(./Seite)>=10]: all elements named "Buch" that have at least 10 but at most 100 child elements named "Seite" (page).

(//Buch[count(Seite)<=100 and count(Seite)>=10] does the same.)

  • substring-before($variable, ':'): the substring before the first colon in the value of the variable named variable.
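Why the order of predicates matters can be shown by emulating them in Python as successive filters over a node list. This is a sketch with hypothetical helper names and an assumed sample document. The filtering is done by hand rather than passed as a path to ElementTree, whose position predicate is not guaranteed to follow full XPath semantics when chained after another predicate.

```python
import xml.etree.ElementTree as ET

kap = ET.fromstring(
    "<kap><pa>a</pa><pa format='bold'>b</pa>"
    "<pa format='bold'>c</pa><pa format='italic'>d</pa></kap>"
)
children = kap.findall("pa")

def is_bold(node):
    return node.get("format") == "bold"

def by_position(nodes, n):
    # [n]: 1-based position within the current node list
    return nodes[n - 1:n]

# pa[@format='bold'][2]: filter for bold first, then take the second hit
bold_then_second = by_position([p for p in children if is_bold(p)], 2)
print([p.text for p in bold_then_second])   # ['c']

# pa[2][@format='bold']: take the second pa first, then test whether it is bold
second_then_bold = [p for p in by_position(children, 2) if is_bold(p)]
print([p.text for p in second_then_bold])   # ['b']
```

Each predicate sees only the nodes that passed the previous one, so the index counts within a different list in the two cases.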

Example

The following XML document is given:

<?xml version="1.0" encoding="utf-8" standalone="yes" ?>
<dok>
    <!-- ein XML-Dokument -->
    <kap title="Nettes Kapitel">
        <pa>Ein Absatz</pa>
        <pa>Noch ein Absatz</pa>
        <pa>Und noch ein Absatz</pa>
        <pa>Nett, oder?</pa>
    </kap>
    <kap title="Zweites Kapitel">
        <pa>Ein Absatz</pa>
        <pa format="bold">Erste Zeile</pa>
        <pa format="bold">Zweite Zeile</pa>
        <pa format="italic">Dritte Zeile</pa>
    </kap>
</dok>

Examples of XPath expressions:

Expression | Selects ...
/dok | the outermost element dok
/* | the outermost element regardless of its name (every well-formed XML document has exactly one outermost element), here dok
//dok/kap | all kap elements within all dok elements
//dok/kap[1] | the first kap element within each dok element
//pa | all pa elements at all levels
//kap[@title='Nettes Kapitel']/pa | all paragraphs of the chapters with the title "Nettes Kapitel"
//kap/pa[2] | the second pa element in each of the two chapters
//kap[2]/pa[@format='bold'][2] | the second pa element with the format 'bold' in the second chapter
child::* | all children of the current node
child::pa | all pa children of the current node
child::text() | all text nodes of the current node
. | the current node
./* | all children of the current node
./pa | all pa children of the current node
pa | all pa children of the current node
attribute::* | all attributes of the current node
namespace::* | all namespaces of the current node
//kap[1]/pa[2]/text() | the text content of the second pa element in the first kap element (i.e. "Noch ein Absatz")
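Some of these expressions can be run against the example document with xml.etree.ElementTree from the Python standard library, shown here as a minimal sketch. ElementTree supports only a subset of XPath 1.0 and evaluates paths relative to an element, so the expressions are rewritten relative to the root element dok (for example, //dok/kap becomes kap). The XML declaration is omitted, and the helper name texts is hypothetical.

```python
import xml.etree.ElementTree as ET

DOC = """<dok>
    <!-- ein XML-Dokument -->
    <kap title="Nettes Kapitel">
        <pa>Ein Absatz</pa>
        <pa>Noch ein Absatz</pa>
        <pa>Und noch ein Absatz</pa>
        <pa>Nett, oder?</pa>
    </kap>
    <kap title="Zweites Kapitel">
        <pa>Ein Absatz</pa>
        <pa format="bold">Erste Zeile</pa>
        <pa format="bold">Zweite Zeile</pa>
        <pa format="italic">Dritte Zeile</pa>
    </kap>
</dok>"""

dok = ET.fromstring(DOC)   # the root element dok serves as the context node

def texts(path):
    """Text content of all elements matched by path, in document order."""
    return [node.text for node in dok.findall(path)]

print(len(dok.findall("kap")))                    # //dok/kap -> 2
print(len(dok.findall(".//pa")))                  # //pa -> 8
print(texts("kap[@title='Nettes Kapitel']/pa"))
# ['Ein Absatz', 'Noch ein Absatz', 'Und noch ein Absatz', 'Nett, oder?']
print(texts("kap/pa[2]"))                         # ['Noch ein Absatz', 'Erste Zeile']
print(dok.find("kap[1]/pa[2]").text)              # Noch ein Absatz
```

Expressions that use explicit axes such as child:: or functions such as count() are outside the subset that ElementTree documents; a full XPath processor is needed for those.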

XPath visualizers help to apply the sometimes complicated XPath queries to concrete XML files.

Literature

  • Michael Kay: XPath 2.0 Programmer's Reference. Wrox Press, 2004, ISBN 0-7645-6910-4 (English)
  • Margit Becher: XML - DTD, XML-Schema, XPath, XQuery, XSLT, XSL-FO, SAX, DOM. W3L Verlag, Witten 2009, ISBN 978-3-937137-69-8.

References

  1. w3.org