Efficient XML Interchange

Efficient XML Interchange (EXI)
File extension: .exi
MIME type: application/exi
Magic number: 24 45 58 49 hex ("$EXI" in ASCII)
Developed by: World Wide Web Consortium
Current version: 1.0 (as of September 19, 2008)
Type: Binary XML
Extended from: XML
Standard(s): Format 1.0 (Recommendation)
Website: Efficient XML Interchange Working Group

Efficient XML Interchange (EXI) is a format specified by the World Wide Web Consortium (W3C) for the binary representation of XML information sets. Compared to text-based XML documents, documents in EXI format can be processed more quickly and require less bandwidth when transferred over a network. Besides EXI, there are other approaches to a binary representation of XML (see Binary XML).

History

Based on the conclusions of the XML Binary Characterization Working Group, the Efficient XML Interchange (EXI) Working Group was founded in November 2005 with the aim of defining a binary representation format for XML. After analyzing and comparing several approaches (including XML + gzip, Fast Infoset, Fujitsu Binary, Xebu and esXML), Efficient XML was chosen as the basis for EXI in November 2006. The first draft of the Efficient XML Interchange standard was published in July 2007.

The working group originally planned to publish EXI as a W3C Recommendation in September 2009. A W3C Proposed Recommendation was published in January 2011, and the W3C Recommendation based on it followed in March 2011. A second edition was published in February 2014.

In November 2016 the working group changed its name from "Efficient XML Interchange (EXI)" to "Efficient Extensible Interchange (EXI)" in order to take into account the broad applicability of the format.

Concept

The algorithm uses a grammar to determine what is likely to occur at a particular point in an XML document. The most likely alternative is then coded with fewer bits than the less likely ones (cf. entropy coding). This general approach can be applied to any language described by a grammar (e.g. SVG, Java, HTML). EXI is optimized for XML languages, but EXI4JSON shows that the method can also be applied to JSON documents.

The built-in grammar accepts any XML document or fragment as input. To make more precise predictions about what will occur at a given point, the grammar can be extended with schema information (e.g. a DTD or an XML Schema).

Using the grammar, the encoder turns the input into a stream of events, each represented by a simple variable-length code. These event codes are similar to Huffman codes, but are much easier to compute and maintain. In addition, the event codes can be compressed by run-length coding.
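The idea that likely events get shorter codes can be illustrated with a small sketch. It assumes a simplified model in which an event code is a tuple of one or more parts, and each part is written as an unsigned integer just wide enough to distinguish the alternatives that share the same prefix. The grammar state, the event names and the helper functions `bits_for` and `code_length` are hypothetical and chosen only for illustration; they are not taken from the specification or from any EXI library.

```python
def bits_for(m: int) -> int:
    """Width of an unsigned integer that can hold m distinct values: ceil(log2(m))."""
    return (m - 1).bit_length() if m > 1 else 0


def code_length(code: tuple, all_codes: list) -> int:
    """Total number of bits for a multi-part event code in one grammar state.

    Each part is sized by how many distinct values occur at that position
    among the codes sharing the same prefix (simplifying assumption).
    """
    total = 0
    for i in range(len(code)):
        prefix = code[:i]
        siblings = {c[i] for c in all_codes if len(c) > i and c[:i] == prefix}
        total += bits_for(len(siblings))
    return total


# Hypothetical grammar state: two expected events with one-part codes,
# two unexpected events that share the longer two-part prefix "2".
GRAMMAR_STATE = {
    "SE(item)": (0,),
    "EE": (1,),
    "SE(*)": (2, 0),
    "CH": (2, 1),
}

if __name__ == "__main__":
    codes = list(GRAMMAR_STATE.values())
    for event, code in GRAMMAR_STATE.items():
        print(event, code, code_length(code, codes), "bits")
```

In this sketch the two expected events cost 2 bits each, while the two unexpected ones cost 3 bits, because they need an extra part to tell them apart.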

Magic number

To distinguish EXI streams from XML streams, two distinguishing bits were introduced. The first two bits of the first octet must have the values '1' and '0', in that order. This bit sequence cannot occur at the start of a well-formed XML 1.0 document in any of the usual character encodings. To guarantee that EXI can also be told apart from possible future encodings, the introduction of a magic cookie was suggested early on.

The Format 1.0 specification states that the EXI header may begin with the so-called EXI cookie, the ASCII character string $EXI (0x24455849). The two distinguishing bits must follow immediately. In the first octet after the EXI cookie, the first two bits and the fourth bit are fixed; the other five are variable. In theory, this yields 32 different magic numbers.

The use of the EXI cookie is optional, but it is strongly recommended in the specification.

Example

A version 1 EXI stream with an EXI cookie and without EXI options would start with the following bytes:

24 45 58 49 80

A version 16 EXI stream with an EXI cookie and with EXI options would start as follows:

24 45 58 49 AF
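The start of the header described above can be checked with a few lines of code. The following is a minimal sketch, not a complete header parser: the function name `parse_exi_header_start` and the returned field names are hypothetical. It assumes that the third bit of the first octet signals the presence of EXI options, that the fourth bit marks a preview version, and that the low four bits hold the version minus one, with the value 15 meaning that the version field continues in the following bits. These assumptions match the two byte sequences shown above.

```python
EXI_COOKIE = b"$EXI"  # 0x24 0x45 0x58 0x49


def parse_exi_header_start(data: bytes) -> dict:
    """Inspect the optional EXI cookie and the first header octet."""
    has_cookie = data.startswith(EXI_COOKIE)
    pos = len(EXI_COOKIE) if has_cookie else 0
    if len(data) <= pos:
        raise ValueError("no header octet found")

    first = data[pos]
    # The two distinguishing bits must be '1' and '0', in that order.
    if first >> 6 != 0b10:
        raise ValueError("distinguishing bits missing, not an EXI stream")

    low_nibble = first & 0x0F
    return {
        "has_cookie": has_cookie,
        "options_present": bool(first & 0b0010_0000),  # assumed: third bit
        "preview_version": bool(first & 0b0001_0000),  # assumed: fourth bit
        # Assumed: values 0..14 encode versions 1..15; 15 means the version
        # field continues beyond this octet, so it is left undecided here.
        "version": low_nibble + 1 if low_nibble < 15 else None,
    }


if __name__ == "__main__":
    for hex_bytes in ("24 45 58 49 80", "24 45 58 49 AF"):
        print(hex_bytes, parse_exi_header_start(bytes.fromhex(hex_bytes)))
```

For the first byte sequence the sketch reports version 1 without options. For the second it reports that options are present and that the version field continues past the first octet. Input that starts like text XML, such as `<?xml`, is rejected because its first two bits are not '1' and '0'.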

Implementations

A detailed description of the implementations can be found on the website of the Efficient XML Interchange Working Group.

  • EXIficient: An open source project supported by Siemens that implements the EXI specification in Java, C/C++ and JavaScript.
  • Efficient XML: An implementation of the EXI specification in Java, .NET and C++ that is marketed commercially by AgileDelta.
  • OpenEXI: An open source project promoted by Fujitsu, the Naval Postgraduate School (NPS) and OptimaLogic that implements the EXI specification in Java.
  • Exi-Connexion: An open source implementation of the EXI Working Draft of March 26, 2008.
  • EXIP: An open source project from Luleå University, written in C.
  • OSS EXI Tools: An implementation from OSS Nokalva in C/C++ and .NET.
