JSON streaming

from Wikipedia, the free encyclopedia

The term JSON streaming covers various methods that allow JSON-encoded data to be processed while it is still being transmitted, or that allow further data to be appended to output that has already been sent or written (and can no longer be changed).

Problem Description

To combine several data items (usually of the same type) into one unit, JSON provides the array data type. A JSON array begins with an opening square bracket, followed by the array elements, which are separated by commas, and ends with a closing square bracket:

[ Element1 , Element2 , ... ]

However, this format is not "streaming-capable", because a JSON parser accepts the array only once it has seen the closing square bracket and cannot process it beforehand. Once an array has been closed, it cannot be "reopened" or extended afterwards.
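The limitation can be observed with any strict parser. The following minimal sketch uses Python's standard json module as an example; the sample data is made up for illustration. The complete array parses, while the same text without its closing bracket is rejected as a whole, even though both elements were received intact.

```python
import json

complete = '[{"id": 123}, {"id": 666}]'
truncated = '[{"id": 123}, {"id": 666}'  # closing bracket not yet received

parsed = json.loads(complete)

try:
    json.loads(truncated)
    rejected = False
except json.JSONDecodeError:
    # A strict parser offers no access to the elements already received.
    rejected = True
```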

Possible solutions

The problem described above can be solved in different ways:

More lenient parser

The parser in the receiver could process the elements of the array as soon as it has clearly recognized the end of the element:

  • Strings, objects and arrays at the closing quotation mark or closing bracket
  • null, true, false and numbers at the first character that follows the value (usually a space, line break or comma)

This approach is problematic if the array is never closed or contains invalid characters: the entire array should then have been discarded because it is not valid JSON, yet some of its elements have already been processed.
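The two rules above can be sketched as follows. This is an illustration only, not a reference implementation: the function name iter_array_elements is hypothetical, the input is assumed to be a text buffer holding whatever part of the array has arrived so far, and Python's json.JSONDecoder.raw_decode is used to parse one element at a time. The sketch stops silently at the first incomplete element, so it shows exactly the drawback described above: elements are handed out before it is known whether the array as a whole is valid.

```python
import json

def iter_array_elements(text):
    """Yield the elements of a possibly truncated JSON array that are
    already unambiguously complete."""
    decoder = json.JSONDecoder()
    n = len(text)

    def skip_ws(i):
        while i < n and text[i] in " \t\r\n":
            i += 1
        return i

    pos = skip_ws(0)
    if pos == n or text[pos] != "[":
        raise ValueError("expected '[' at start of array")
    pos = skip_ws(pos + 1)
    while pos < n and text[pos] != "]":
        try:
            value, end = decoder.raw_decode(text, pos)
        except json.JSONDecodeError:
            return  # element is incomplete (or invalid): stop here
        # Strings, objects and arrays end with their own closing character.
        # null, true, false and numbers are only finished once a following
        # delimiter has been seen ("4" might still become "42").
        self_closing = isinstance(value, (str, list, dict))
        if not self_closing and (end == n or text[end] not in " \t\r\n,]"):
            return
        yield value
        pos = skip_ws(end)
        if pos == n or text[pos] == "]":
            return
        if text[pos] != ",":
            raise ValueError(f"expected ',' or ']' at offset {pos}")
        pos = skip_ws(pos + 1)

partial = list(iter_array_elements('[1, "a", {"b": 2}, 4'))
full = list(iter_array_elements('[1, "a", {"b": 2}, 4]'))
```

In the partial input the final 4 is held back because no delimiter has followed it yet; once the closing bracket arrives, it is delivered as well.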

Line-delimited JSON

JSON is a textual format that permits different formatting. In particular, more complex data objects can be made more human-readable by inserting spaces and line breaks at suitable places. Conversely, the representation can be compacted so that a data element contains no line breaks at all (line breaks within character strings must be encoded as \n anyway).

If the sender and recipient have agreed on such "single-line" formatted data elements, the end-of-line character can be used as a separator:

{"id":123, "name":"Jane Doe", "value":4711}
{"id":666, "name":"E. Teufel", "value":-1}

This format is used by various JavaScript frameworks and is also known as LDJSON, NDJSON or JSONL.

Pro: This format can be handled by programs that process data line by line without regard to the content of the lines.

Cons: The individual data objects may have to be reformatted so that they do not contain any line breaks.
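Reading and writing this format needs little more than a line loop. The following is a minimal sketch using Python's standard json module; the helper names read_ndjson and write_ndjson are hypothetical, and an in-memory io.StringIO stands in for a file or network stream. Skipping blank lines is a choice made for this sketch, not something the text above prescribes.

```python
import io
import json

def read_ndjson(stream):
    """Yield one parsed value per non-empty line of a text stream."""
    for line in stream:
        line = line.strip()
        if line:  # tolerate blank lines, e.g. after the last record
            yield json.loads(line)

def write_ndjson(values, stream):
    """Write each value as compact single-line JSON followed by a line feed."""
    for value in values:
        # Without an indent argument, json.dumps emits no raw line breaks.
        stream.write(json.dumps(value, separators=(",", ":")) + "\n")

records = [
    {"id": 123, "name": "Jane Doe", "value": 4711},
    {"id": 666, "name": "E. Teufel", "value": -1},
]
buffer = io.StringIO()
write_ndjson(records, buffer)
buffer.seek(0)
decoded = list(read_ndjson(buffer))
```

Because each record is decoded on its own, a consumer can act on the first line before the second has arrived.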

Record separator delimited JSON

As a text format, JSON must not contain any ASCII control characters apart from 09 hex (tab), 0A hex (line feed) and 0D hex (carriage return). Any control characters in character strings must be replaced by appropriate escape sequences.

Thus it is possible (as an incompatible extension) to define a special ASCII control character as a separator for the individual data elements.

Usually the character 1E hex is used for this; its name, "Record Separator" (RS), recalls that it was originally intended as a separator for structured data.

This format is also called JSON text sequences, has its own MIME type application/json-seq and is specified in RFC 7464. There, however, the RS character is used not as a separator between the entries but as a start character at the beginning of each entry. A line feed is also appended after each entry. When parsing, note that further line breaks can occur within an entry.

Cons: Requires customized parsers and generators.

Pro: The individual data elements do not need laborious reformatting but can be adopted 1:1 from other JSON sources.

Pro: Preprocessing that separates the individual entries from one another only requires scanning for the RS character and is therefore very simple.
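The framing described above (RS before each entry, a line feed after it) can be sketched in a few lines. The function names encode_json_seq and decode_json_seq are hypothetical, and the sketch assumes the whole sequence is available as one string. It makes no attempt at the recovery from truncated entries that a production parser would need.

```python
import json

RS = "\x1e"  # ASCII Record Separator, placed at the start of each entry

def encode_json_seq(values):
    """Frame each value as RS + JSON text + line feed."""
    return "".join(RS + json.dumps(value) + "\n" for value in values)

def decode_json_seq(text):
    """Split on RS and parse every non-empty chunk.

    json.loads ignores surrounding whitespace, so the trailing line feed
    and any line breaks inside a pretty-printed entry do no harm.
    """
    for chunk in text.split(RS):
        if chunk.strip():
            yield json.loads(chunk)

values = [{"id": 123, "name": "Jane Doe"}, [1, 2, 3], 42]
framed = encode_json_seq(values)

# An entry that spans several lines is still one record.
pretty = RS + json.dumps({"a": 1, "b": [1, 2]}, indent=2) + "\n"
```

Splitting happens before any JSON parsing, which is the preprocessing advantage noted above.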

Concatenated JSON

The data elements are transmitted one after the other without any separators. For strings, arrays and objects this poses no problem for suitable parsers; values of the other data types must be followed by at least one space or line break so that the parser can tell where they end.

Usually a line break is simply sent after each data element, even if it is not necessary for unambiguous parsing.

Pro: Most JSON parsers can easily read several JSON data elements in succession from a data stream.

Cons: It is not possible to separate the individual data elements from one another in a preprocessing step without parsing the complete JSON format.
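As an illustration of how a parser can read several concatenated elements, the following sketch uses json.JSONDecoder.raw_decode from Python's standard library, which parses one value and returns the offset at which it ended. The function name iter_concatenated is hypothetical, and the input is assumed to be fully buffered in a string; a real stream would need additional buffering logic.

```python
import json

def iter_concatenated(text):
    """Yield successive JSON values from a string of concatenated JSON texts."""
    decoder = json.JSONDecoder()
    pos = 0
    end = len(text)
    while pos < end:
        # raw_decode does not skip leading whitespace, so skip it here.
        while pos < end and text[pos] in " \t\r\n":
            pos += 1
        if pos == end:
            break
        value, pos = decoder.raw_decode(text, pos)
        yield value

# Objects and arrays need no separator; the number and the literal do.
data = '{"id":123}{"id":666}[1,2] "x" 42 true\n'
values = list(iter_concatenated(data))
```

The element boundaries only become known as a side effect of parsing, which is the drawback stated above.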