Java API for XML Processing
The Java API for XML Processing, or JAXP, is one of the Java XML APIs. It is a lightweight, standardized API for validating, parsing, generating, and transforming XML documents. The underlying (non-standardized) implementation of the API is interchangeable (pluggable). The four basic interfaces are:
- the Document Object Model parser interface or DOM interface
- the Simple API for XML parser interface or SAX interface
- the Streaming API for XML or StAX interface (added in JDK 6; available separately as a JAR for JDK 5)
- the XSLT interface to enable transformations of data and structures in an XML document.
J2SE 1.4 was the first JDK version to ship with an implementation of JAXP 1.1; current Java SE releases ship with Apache Xerces and an adapted variant of Xalan (for XSLT).
Versioning
Java SE version | included JAXP version |
---|---|
1.4 | 1.1 |
1.5 | 1.3 |
1.6 | 1.4 |
1.7.0 | 1.4.5 |
1.7.40 | 1.5 |
1.8 | 1.6 |
DOM interface
The DOM interface is very simple: it parses an entire XML document and builds a complete in-memory representation of it. It uses the classes and concepts defined in the Document Object Model (DOM) Level 2 Core Specification.
The DOM parser is called DocumentBuilder because it builds an in-memory document representation. An instance of the class javax.xml.parsers.DocumentBuilder is created by the factory class javax.xml.parsers.DocumentBuilderFactory. The DocumentBuilder creates an org.w3c.dom.Document instance, a tree structure that contains the nodes of the XML document. Each node in this tree implements the org.w3c.dom.Node interface. There are many different node types, each representing a particular kind of data from the XML document.
The most important nodes are:
- Element nodes, possibly with attributes
- Text nodes that reflect the text found between the start and end tags of a document element
For a complete list of node types, refer to the Javadoc documentation of the org.w3c.dom package.
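The parsing steps described above can be sketched as follows. This is a minimal illustration, not a reference implementation: the class name DomParseSketch, its helper methods, and the sample XML are assumptions made for this example, while the JAXP and DOM calls are the standard ones.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, named for this example only.
class DomParseSketch {

    // Parses an XML string into an in-memory DOM tree.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            // Wrapped to keep the sketch short; real code would handle
            // configuration, parse, and I/O errors separately.
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Collects the text content of every element with the given tag name.
    static List<String> textOf(Document doc, String tagName) {
        List<String> result = new ArrayList<>();
        NodeList nodes = doc.getElementsByTagName(tagName);
        for (int i = 0; i < nodes.getLength(); i++) {
            Node node = nodes.item(i);
            result.add(node.getTextContent());
        }
        return result;
    }

    public static void main(String[] args) {
        Document doc = parse("<books><book id=\"1\">Dune</book><book id=\"2\">Emma</book></books>");
        System.out.println(doc.getDocumentElement().getTagName());
        System.out.println(textOf(doc, "book"));
    }
}
```

Because the whole tree is held in memory, the nodes can be visited in any order and as often as needed after parsing.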
The DOM API works in both directions: from XML to an in-memory DOM and from the DOM back to XML. It is therefore suitable not only for parsing XML but also for generating it (as streams or files).
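The direction from DOM to XML could look like the sketch below. It builds a small tree and serializes it with an identity Transformer from javax.xml.transform, which is one common way to write a DOM out with JAXP, not the only one. The class name DomWriteSketch, the method buildGreeting, and the element and attribute names are assumptions for illustration.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper class, named for this example only.
class DomWriteSketch {

    // Builds <greeting lang="...">text</greeting> as a DOM tree and
    // serializes it to a string.
    static String buildGreeting(String lang, String text) {
        try {
            // Create an empty document and attach one element with an
            // attribute and a text node.
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .newDocument();
            Element root = doc.createElement("greeting");
            root.setAttribute("lang", lang);
            root.appendChild(doc.createTextNode(text));
            doc.appendChild(root);

            // A Transformer created without a stylesheet copies its
            // input to its output unchanged (identity transformation).
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            identity.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not generate XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(buildGreeting("en", "Hello"));
    }
}
```

Replacing the StringWriter with a file or output stream in the StreamResult writes the same document to a file or stream instead.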
SAX interface
The SAX parser, also called SAXParser, is created by javax.xml.parsers.SAXParserFactory. In contrast to the DOM parser, the SAXParser does not create an in-memory representation of an XML document, which makes it faster and less demanding in terms of memory consumption. Instead, the SAXParser informs the client about the structure of the XML document through callback functions (callbacks), i.e. it invokes methods of the DefaultHandler instance that was passed to the parser.
The DefaultHandler class is in the org.xml.sax.helpers package. It implements the ContentHandler, ErrorHandler, DTDHandler, and EntityResolver interfaces. Most clients are interested in the methods of the ContentHandler interface.
The ContentHandler methods, implemented by the DefaultHandler, are called as soon as the SAX parser encounters the corresponding elements of the XML document. The main methods in this interface are:
- the startDocument() and endDocument() methods, which are called at the start and end of an XML document.
- the startElement() and endElement() methods, which are called at the start and end tags of a document element.
- the characters() method, which is called with the content between the start and end tags of the respective XML document element.
The client provides a subclass of DefaultHandler that overrides these methods and processes the data. This can include storing the data in a database or writing it to a stream.
The SAX API works in one direction only, namely from XML into Java. It is therefore suitable only for parsing XML; XML (streams or files) cannot be created with SAX.
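The callback mechanism described in this section can be sketched as follows. The handler class EventLogger, the wrapper class SaxSketch, and the event labels are assumptions for this example; the overridden methods are those of DefaultHandler named above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, named for this example only.
class SaxSketch {

    // Records one entry per callback instead of building a tree.
    static class EventLogger extends DefaultHandler {
        final List<String> events = new ArrayList<>();

        @Override
        public void startDocument() {
            events.add("startDocument");
        }

        @Override
        public void endDocument() {
            events.add("endDocument");
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attrs) {
            events.add("start:" + qName);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            events.add("end:" + qName);
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // A parser may deliver one text run in several chunks;
            // this sketch simply logs each non-blank chunk.
            String text = new String(ch, start, length).trim();
            if (!text.isEmpty()) {
                events.add("text:" + text);
            }
        }
    }

    // Parses the XML string and returns the sequence of callbacks seen.
    static List<String> events(String xml) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            EventLogger handler = new EventLogger();
            parser.parse(new InputSource(new StringReader(xml)), handler);
            return handler.events;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        events("<a><b>Hi</b></a>").forEach(System.out::println);
    }
}
```

The handler only sees each piece of the document once, in document order, so any state needed later has to be kept by the client itself.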
XMLPULL interface
The Streaming API for XML (StAX) has been part of JAXP since JAXP 1.4 and thus Java SE 6. It is used to read XML data with so-called pull parsers (XMLPULL). Pull parsing is similar to SAX, except that the parser does not push information to the application through an event mechanism ("push") as SAX does; instead, the application fetches the next piece of information itself when it needs it ("pull"). Pull parsers are usually more efficient than SAX parsers.
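A pull loop of this kind might look like the following sketch, which uses the cursor-style XMLStreamReader from javax.xml.stream. The class name StaxSketch and the method elementNames are assumptions for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class, named for this example only.
class StaxSketch {

    // Returns the local names of all elements in document order.
    static List<String> elementNames(String xml) {
        List<String> names = new ArrayList<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            try {
                // The application drives the parser: it asks for the
                // next event only when it is ready to process it.
                while (reader.hasNext()) {
                    int event = reader.next();
                    if (event == XMLStreamConstants.START_ELEMENT) {
                        names.add(reader.getLocalName());
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("could not read XML", e);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(elementNames("<a><b/><c>x</c></a>"));
    }
}
```

Since the loop belongs to the application, it can stop reading as soon as it has found what it needs, which is harder to express with SAX callbacks.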
XSLT interface
The XSLT interface (Extensible Stylesheet Language Transformations, abbreviated XSLT) allows an XML document to be converted into other forms of data.
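As a minimal sketch of such a conversion, the example below applies a stylesheet to an XML string with javax.xml.transform and produces plain text. The class name XsltSketch, the stylesheet, and the sample element names are assumptions made for this illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class, named for this example only.
class XsltSketch {

    // Hypothetical stylesheet: writes the text of every <name> element,
    // each followed by a semicolon, as plain text.
    static final String STYLESHEET =
          "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"text\"/>"
        + "<xsl:template match=\"/\">"
        + "<xsl:for-each select=\"//name\">"
        + "<xsl:value-of select=\".\"/><xsl:text>;</xsl:text>"
        + "</xsl:for-each>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Applies the given stylesheet to the given XML and returns the result.
    static String transform(String xml, String xslt) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<people><name>Ada</name><name>Linus</name></people>";
        System.out.println(transform(xml, STYLESHEET));
    }
}
```

The same Transformer accepts DOM, SAX, and stream sources and results, so the output could equally be another XML document or an HTML page.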
XSD validation
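Validation of XML documents against XSD (XML Schema) files is supported from JAXP 1.2. JAXP 1.2 is part of J2EE 1.4; on Java SE, the first bundled release with this support is JAXP 1.3 in Java SE 5 (see the version table above).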
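A validation run could be sketched as shown below. Note the assumption: this example uses the javax.xml.validation package, which is available in JAXP 1.3 and later rather than in JAXP 1.2 itself. The class name XsdValidationSketch, the method isValid, and the sample schema are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

// Hypothetical helper class, named for this example only.
class XsdValidationSketch {

    // Hypothetical schema: a single <age> element holding a
    // non-negative integer.
    static final String SCHEMA =
          "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
        + "<xs:element name=\"age\" type=\"xs:nonNegativeInteger\"/>"
        + "</xs:schema>";

    // Returns true if the XML document conforms to the given schema.
    static boolean isValid(String xml, String xsd) {
        Schema schema;
        try {
            SchemaFactory factory =
                    SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            schema = factory.newSchema(new StreamSource(new StringReader(xsd)));
        } catch (SAXException e) {
            // The schema itself is broken; that is a caller error here.
            throw new IllegalArgumentException("invalid schema", e);
        }
        try {
            Validator validator = schema.newValidator();
            // Without a custom ErrorHandler, the first validation error
            // is reported as a SAXException.
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<age>42</age>", SCHEMA));
        System.out.println(isValid("<age>-1</age>", SCHEMA));
    }
}
```

A compiled Schema object can be reused across many documents, so the schema only needs to be loaded once.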
References
- https://www.jcp.org/en/jsr/detail?id=206
- http://www.w3.org/TR/2000/REC-DOM-Level-2-Core-20001113
External links
- Oracle's JAXP product description
- Tutorial: XML with Xerces for Java (example program using the DOM parser and the SAX parser)