Structured Data Exchange Format


Structured Data eXchange Format (SDXF) is a hierarchically structured data format.

With this format, almost any structured data can be recorded for the purpose of exchange. This format is equally suitable as a file format and as a networking message format.

SDXF allows nesting to any depth, and the individual data elements are self-describing. The format is deliberately kept simple yet transparent to use: programs access the data elements through a small set of basic functions and tools, so the programmer does not need to deal with the exact physical layout of the data.

SDXF also supports the exchange of data between different computer architectures by converting the data transparently.

SDXF is defined and published in RFC 3072.

Structured data here means data whose structure is recognized and processed as such by the programs involved. Ordinary text is "structured" into lines, paragraphs and so on, but does not count as structured in the sense of this definition.

Example

Two companies with a business relationship decide to process their invoices electronically. An "invoice" (German: Rechnung) is then an electronic document with the following hierarchically nested structure (the element names are in German):

RECHNUNG
 │
 ├─ RECHNUNGS_NR
 ├─ DATUM
 ├─ ANSCHRIFT_SENDER
 │    ├─ NAME
 │    ├─ NAME
 │    ├─ STRAßE
 │    ├─ PLZ
 │    ├─ ORT
 │    └─ LAND
 ├─ ANSCHRIFT_EMPFÄNGER
 │    ├─ NAME
 │    ├─ NAME
 │    ├─ STRAßE
 │    ├─ PLZ
 │    ├─ ORT
 │    └─ LAND
 ├─ RECHNUNGS_SUMME
 ├─ EINZELPOSTEN_ALLE
 │    ├─ EINZELPOSTEN
 │    │    ├─ ANZAHL
 │    │    ├─ ARTIKELNUMMER
 │    │    ├─ ARTIKELTEXT
 │    │    ├─ PREIS
 │    │    └─ SUMME
 │    └─ …
 ├─ KONDITIONEN
 …

SDXF must be able to represent structures like this one, as well as more complex ones. The basic building block is the "chunk". The entire SDXF structure is a chunk, and a chunk can in turn consist of several chunks.

A chunk is very simple. It consists of a six-byte header followed by the actual data. The header contains the identifier of the chunk as a 2-byte binary number (chunk ID), the length of the data that follows, and an indication of the data type together with any additional information (compression, encryption and others).
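The header layout can be illustrated with a minimal Python sketch. It assumes a 2-byte chunk ID, a 1-byte field for type and flags, and a 3-byte length, all in big-endian order; the type codes and their position in the top three bits of the flag byte are assumptions of this sketch, and RFC 3072 remains the authority for the exact bit assignments. The names `pack_chunk` and `unpack_header` are hypothetical.

```python
import struct

HEADER_SIZE = 6  # 2-byte chunk ID + 1-byte type/flags + 3-byte length (assumed split)

# Assumed type codes for this sketch; check RFC 3072 before relying on them.
TYPE_STRUCTURED = 1
TYPE_NUMERIC = 3
TYPE_CHAR = 4


def pack_chunk(chunk_id: int, data_type: int, payload: bytes) -> bytes:
    """Prefix the payload with a six-byte header."""
    if not 1 <= chunk_id <= 0xFFFF:
        raise ValueError("chunk ID must be in the range 1..65535")
    if len(payload) > 0xFFFFFF:
        raise ValueError("payload does not fit into a 3-byte length field")
    flags = (data_type & 0x07) << 5  # data type in the top three bits (assumption)
    header = struct.pack(">HB", chunk_id, flags) + len(payload).to_bytes(3, "big")
    return header + payload


def unpack_header(chunk: bytes) -> tuple:
    """Return (chunk_id, data_type, length) read from the first six bytes."""
    chunk_id, flags = struct.unpack(">HB", chunk[:3])
    length = int.from_bytes(chunk[3:HEADER_SIZE], "big")
    return chunk_id, flags >> 5, length


# A text chunk with the hypothetical ID 2 carrying an ISO 8601 date.
chunk = pack_chunk(2, TYPE_CHAR, b"2005-06-17")
print(chunk.hex())
print(unpack_header(chunk))
```

Because the length is stored in the header, a reader can skip a chunk it does not understand without interpreting its content.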

The data type indicates whether the data is text, a binary number (integer or floating point), or a series of further chunks (a structured chunk).

Structured chunks make it easy to pack hierarchical constructions such as the invoice (RECHNUNG) above into an SDXF structure. First, each of the names used (RECHNUNG, RECHNUNGS_NR, DATUM, ANSCHRIFT_SENDER, etc.) must be assigned a unique number (chunk ID) in the range 1 to 65535. Then the structured chunk RECHNUNG (level 1) is built as the first chunk; it consists of a series of level 2 chunks: RECHNUNGS_NR, DATUM, ANSCHRIFT_SENDER, ANSCHRIFT_EMPFÄNGER, RECHNUNGS_SUMME, EINZELPOSTEN_ALLE, KONDITIONEN. Some of these level 2 chunks are themselves structured: the two addresses (ANSCHRIFT_SENDER, ANSCHRIFT_EMPFÄNGER) and the list of line items (EINZELPOSTEN_ALLE). The date can be given either in text format, for example as YYYY-MM-DD (ISO 8601), or as a structure consisting of the three numeric chunks YEAR, MONTH, DAY.
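The ID assignment and the nesting described here might look as follows in a small Python sketch. The concrete ID values, the type codes, the 4-byte big-endian integer encoding and the helper names (`numeric`, `text`, `structured`) are all assumptions made for illustration; only the six-byte header and the idea of chunks containing chunks come from the description above.

```python
import struct
from enum import IntEnum


class ChunkID(IntEnum):
    """Hypothetical ID assignment; any unique values in 1..65535 would do."""
    RECHNUNG = 1       # invoice
    RECHNUNGS_NR = 2   # invoice number
    DATUM = 3          # date
    YEAR = 4
    MONTH = 5
    DAY = 6


STRUCTURED, NUMERIC, CHAR = 1, 3, 4  # assumed type codes


def chunk(chunk_id, data_type, payload):
    """Six-byte header (ID, type byte, 3-byte length) followed by the payload."""
    header = struct.pack(">HB", chunk_id, data_type << 5)
    return header + len(payload).to_bytes(3, "big") + payload


def numeric(chunk_id, value):
    # Assumption: integers travel as 4-byte big-endian values.
    return chunk(chunk_id, NUMERIC, struct.pack(">i", value))


def text(chunk_id, value):
    return chunk(chunk_id, CHAR, value.encode("utf-8"))


def structured(chunk_id, *children):
    # A structured chunk is simply the concatenation of its child chunks.
    return chunk(chunk_id, STRUCTURED, b"".join(children))


# The two ways of representing the date mentioned in the text.
date_as_text = text(ChunkID.DATUM, "2005-06-17")
date_as_structure = structured(
    ChunkID.DATUM,
    numeric(ChunkID.YEAR, 2005),
    numeric(ChunkID.MONTH, 6),
    numeric(ChunkID.DAY, 17),
)

# Level 1 chunk containing two level 2 chunks.
invoice = structured(
    ChunkID.RECHNUNG,
    numeric(ChunkID.RECHNUNGS_NR, 123456),
    date_as_text,
)
print(len(date_as_text), len(date_as_structure), len(invoice))
```

In this sketch the text form of the date is more compact, while the structured form lets a reader obtain year, month and day without parsing a string.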

The RFC describes a chunk in detail on page 3.

The SDXF concept provides that a programmer works with SDXF structures through a precisely defined set of processing functions.

Basic functions

Functions for reading out chunks

function   description
init       Initializes the parameter structure and links it to an existing chunk for analysis.
enter      Enters a structured chunk; the first chunk of the structure is made available (cursor).
leave      Leaves the current structure; the cursor then points to the chunk of that structure.
next       Makes the next chunk available; the cursor is advanced.
extract    Transfers the data of the chunk at the cursor position to a user area.
select     Makes available the next chunk within the structure that has the specified name.

Functions for constructing chunks

function   description
init       Initializes the parameter structure and links it to an empty output buffer in order to create a new chunk structure.
create     Creates a new chunk and appends it to the existing structure (if there is one).
append     Appends a complete chunk to an SDXF structure.
leave      Leaves the current structure; the cursor then points to the chunk of that structure.

Example of a builder

A creation program for the invoice example could look something like this:

init (sdx, buffersize=1000);   // initialize the SDXF parameter structure sdx
create (sdx, ID=RECHNUNG, datatype=STRUCTURED); // start of the main structure
create (sdx, ID=RECHNUNGS_NR, datatype=NUMERIC, value=123456); // elementary chunk
create (sdx, ID=DATUM, datatype=CHAR, value="2005-06-17"); // elementary chunk
create (sdx, ID=ANSCHRIFT_SENDER, datatype=STRUCTURED); // substructure
create (sdx, ID=NAME, datatype=CHAR, value="Karl Napp"); // chunk inside the substructure

create (sdx, ID=LAND, datatype=CHAR, value="Schweiz"); // chunk inside the substructure
leave; // close the substructure ANSCHRIFT_SENDER

leave; // close the main structure RECHNUNG

The syntax here is fictitious for the sake of simplicity; a full example in C can be found on the PINPI website.
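For readers who want something executable, the create/leave pattern can be imitated in a few lines of Python. This is only a sketch under assumptions: the class `Builder`, the ID values, the type codes and the 4-byte integer encoding are hypothetical and are not taken from the PINPI implementation.

```python
import struct

STRUCTURED, NUMERIC, CHAR = 1, 3, 4  # assumed type codes
# Hypothetical chunk IDs for the names used in the invoice example.
RECHNUNG, RECHNUNGS_NR, DATUM, ANSCHRIFT_SENDER, NAME, LAND = range(1, 7)


def pack(chunk_id, data_type, payload):
    """Six-byte header (ID, type byte, 3-byte length) followed by the payload."""
    header = struct.pack(">HB", chunk_id, data_type << 5)
    return header + len(payload).to_bytes(3, "big") + payload


class Builder:
    def __init__(self):
        self.open = []      # stack of (chunk_id, [encoded children]) still being built
        self.result = b""   # finished top-level chunk(s)

    def _emit(self, encoded):
        # A finished chunk goes into the innermost open structure, if any.
        if self.open:
            self.open[-1][1].append(encoded)
        else:
            self.result += encoded

    def create(self, chunk_id, data_type, value=None):
        if data_type == STRUCTURED:
            self.open.append((chunk_id, []))   # stays open until leave()
        elif data_type == NUMERIC:
            self._emit(pack(chunk_id, NUMERIC, struct.pack(">i", value)))
        else:
            self._emit(pack(chunk_id, CHAR, value.encode("utf-8")))

    def leave(self):
        # Close the innermost structure; its length is known only now.
        chunk_id, children = self.open.pop()
        self._emit(pack(chunk_id, STRUCTURED, b"".join(children)))


sdx = Builder()
sdx.create(RECHNUNG, STRUCTURED)
sdx.create(RECHNUNGS_NR, NUMERIC, 123456)
sdx.create(DATUM, CHAR, "2005-06-17")
sdx.create(ANSCHRIFT_SENDER, STRUCTURED)
sdx.create(NAME, CHAR, "Karl Napp")
sdx.create(LAND, CHAR, "Schweiz")
sdx.leave()   # closes ANSCHRIFT_SENDER
sdx.leave()   # closes RECHNUNG
print(len(sdx.result), sdx.result.hex())
```

The sketch buffers the children of an open structure and writes the header on `leave`, because the length field of a structured chunk depends on everything it contains.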

Example of a readout program

The readout follows the given structure:

init (sdx, container=address of the SDXF structure);   // initialize the SDXF parameter structure sdx
enter (sdx); // enter the RECHNUNG structure

do while (sdx.rc == SDX_RC_ok)
{
    switch (sdx.Chunk_ID)
    {
        case RECHNUNGS_NR:
            extract (sdx);
            rechnr = sdx.value;  // integer numeric values are stored in value
            break;
            //
        case DATUM:
            extract (sdx);
            strcpy (rechdate, sdx.data); // data is a pointer to the extracted character string
            break;
            //
        case ANSCHRIFT_SENDER:
            enter (sdx);  // because the address is itself a structure
            do while (sdx.rc == SDX_RC_ok)

            break;

    }
    next (sdx); // advance the cursor to the next chunk of the current structure
}
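The same traversal can be sketched in runnable Python. The generator `children` plays the role of enter/next, and decoding the content corresponds to extract. The header layout (2-byte ID, type byte, 3-byte length), the type codes, the ID values and the function names are assumptions of this sketch rather than the actual SDXF library interface.

```python
import struct

STRUCTURED, NUMERIC, CHAR = 1, 3, 4  # assumed type codes
# Hypothetical chunk IDs for the names used in the invoice example.
RECHNUNG, RECHNUNGS_NR, DATUM, ANSCHRIFT_SENDER, NAME = 1, 2, 3, 4, 5


def pack(chunk_id, data_type, payload):
    """Helper used only to build the sample data below."""
    header = struct.pack(">HB", chunk_id, data_type << 5)
    return header + len(payload).to_bytes(3, "big") + payload


def children(payload):
    """Yield (chunk_id, data_type, content) for each chunk in a byte string."""
    pos = 0
    while pos < len(payload):
        chunk_id, flags = struct.unpack_from(">HB", payload, pos)
        length = int.from_bytes(payload[pos + 3:pos + 6], "big")
        yield chunk_id, flags >> 5, payload[pos + 6:pos + 6 + length]
        pos += 6 + length   # the stored length tells us where the next chunk starts


def read_invoice(buffer):
    invoice = {}
    root_id, root_type, content = next(children(buffer))   # top-level chunk
    if root_id != RECHNUNG or root_type != STRUCTURED:
        raise ValueError("not an invoice structure")
    for chunk_id, data_type, data in children(content):
        if chunk_id == RECHNUNGS_NR:
            invoice["number"] = struct.unpack(">i", data)[0]
        elif chunk_id == DATUM:
            invoice["date"] = data.decode("utf-8")
        elif chunk_id == ANSCHRIFT_SENDER:
            # The address is itself a structure, so descend one level.
            invoice["sender"] = {cid: d.decode("utf-8") for cid, _, d in children(data)}
        # Chunks with unknown IDs are skipped without being interpreted.
    return invoice


sample = pack(RECHNUNG, STRUCTURED,
              pack(RECHNUNGS_NR, NUMERIC, struct.pack(">i", 123456))
              + pack(DATUM, CHAR, b"2005-06-17")
              + pack(ANSCHRIFT_SENDER, STRUCTURED, pack(NAME, CHAR, b"Karl Napp")))
result = read_invoice(sample)
print(result)
```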

The individual SDXF elements (chunks) are created piece by piece. This may seem awkward at first, but:

  1. This reflects exactly the reality of everyday programming: the individual elements are held in various program variables or database fields, or must be read into them.
  2. It is the only way to ensure that a structure with mixed data types is processed (normalized) so that it can be transferred from one computer to another (via file transfer or network) without adaptation, independently of the computer architecture. This covers the differing representation of data, namely character encoding and byte swapping. The SDXF functions presented take this problem off the application programmer's hands entirely.
  3. Encryption (AES, among others) and compression (zip) are also handled by the SDXF functions, and individual parts can be compressed and/or encrypted separately. On reading, the data is decompressed and decrypted automatically (provided the password is correct).
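Points 2 and 3 can be made concrete with a short Python sketch. It shows why a fixed byte order and an agreed character encoding matter when data moves between machines, and what transparent compression of a chunk's content amounts to. The choice of big-endian 4-byte integers is an assumption of this sketch, `zlib` (DEFLATE) merely stands in for the zip compression mentioned above, and encryption is left out because the Python standard library has no AES implementation.

```python
import struct
import sys
import zlib

value = 123456

# Byte swapping: the same integer has different native layouts on different hosts.
little = struct.pack("<i", value)   # layout on a little-endian machine
big = struct.pack(">i", value)      # layout on a big-endian machine
print(sys.byteorder, little.hex(), big.hex())

# Writing in one agreed order (big-endian here, by assumption) means the reader
# decodes the same value regardless of its own architecture.
wire = struct.pack(">i", value)
decoded = struct.unpack(">i", wire)[0]

# Character representation: the same text differs between ASCII and EBCDIC hosts,
# so one side has to convert. cp500 is one of Python's EBCDIC codecs.
name = "Karl Napp"
ascii_bytes = name.encode("ascii")
ebcdic_bytes = name.encode("cp500")
restored = ebcdic_bytes.decode("cp500")

# Compression: the content of a chunk is compressed on writing and
# decompressed on reading, invisibly to the application code.
content = (name + " ").encode("utf-8") * 50
compressed = zlib.compress(content)
roundtrip = zlib.decompress(compressed)
print(len(content), len(compressed))
```

In a library that follows the SDXF approach, these conversions would happen inside the create and extract functions, so application code would only ever see native values.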

If the entire structure is to be edited as a text file, the SDEF format can be used.

Footnotes

  1. RFC 3072 page 3, description of a chunk
  2. SDXF example on the PINPI website
  3. http://www.pinpi.com/de/SDXF_4.htm
  4. http://www.pinpi.com/de/SDXF_5.htm
  5. http://www.pinpi.com/de/sdef.htm
