JBIG2

from Wikipedia, the free encyclopedia

JBIG2 is a method for image compression of binary images for both lossless and lossy compression. JBIG2 was developed by the "Joint Bi-level Image Experts Group" and was published in 2000 as the international standard ITU T.88 and in 2001 as ISO / IEC 14492. It is a further development of JBIG .

functionality

Although the JBIG2 standard relates only to decoding, the encoder is expected to divide the pages of the input documents into three types of regions: text, graphics, and generic regions. The latter reproduces objects that cannot be classified as text or images, for example lines or noise.

A text region consists of a number of symbols that are placed on a background. Typically, a symbol corresponds to a character (e.g. letter) that appears in a text. The symbols are stored in a symbol dictionary and can be reused by specifying their indices. The storage in the dictionary takes place either as a coded bitmap or as a refinement of another dictionary entry, whereby only the difference to the template is saved. With lossy compression, even slightly different symbols refer to the same symbol dictionary entry.

Raster graphics are compressed by reconstructing grayscale images and frequently occurring patterns are stored in a library. Lossless and lossy coding are handled like text regions.

The regions specified by the encoder do not have to be disjoint . Possible overlapping areas are offset by means of the specified combination operators (OR, AND, XOR, XNOR or REPLACE).

JBIG2 files are divided into segments. For example, a document page consists of a page information segment, a symbol dictionary segment, a text region segment, a sample dictionary segment, a halftone region segment, and an end-of-page segment. The dictionary segments contain raster graphics that are referenced by the region segments. Because symbols and patterns on different pages can refer to the same dictionary segment, there is cross-page compression. Segments are clearly numbered and consist of a segment header, a data header and data. The segment header contains the segment number, if other segments are referenced in the data part, also their segment numbers, and the page number on which the decoded graphic is to be placed or, for global segments, the value 0.

Compression
method Three different methods are used for compression:

use

JBIG2 data can appear as independent files or embedded in other file formats such as PDF (version 1.4 or higher).

Open source decoders for JBIG2 are jbig2dec (written in C ) and jbig2-imageio (written in Java ).

disadvantage

In the variant of JBIG2 with lossy compression, the use of non-identical symbols from the symbol dictionary can lead to falsification of small document details (e.g. numbers). If this error occurs, it is very difficult to detect visually - in contrast to errors in other compression methods that are usually clearly visible. This led to the scan copier incident in 2013.

Although only the JBIG2 variant with lossy compression is affected by this dangerous effect, the Federal Office for Information Security (BSI) has since March 16, 2015 every image compression method with symbol coding - in particular this includes JBIG2 compression - as unsuitable for the legally compliant substitute scanning. The “Coordination Office for the Permanent Archiving of Electronic Documents” in Switzerland comes to the same conclusion.

Swap digits on scan copiers

In August 2013, David Kriesel made it public that Xerox copiers were incorrectly displaying digits in the scans. The bug was only discovered eight years after it was published because the numbers given for the room sizes on a copy of the building plan did not match the drawing. The error could be reproduced by many users on other models. The number of all affected devices was estimated at 200,000–300,000. For example, in the case of drug dosage information, construction plans for bridges or in the financial sector, this incorrect data can have unpredictable consequences. The incorrect implementation of JBIG2 or the poor parameterization only affected scanning, not copying, printing or faxing. Important large companies, but also state institutions such as the military, used the faulty compression method for eight years.

Web links

Individual evidence

  1. Official JBIG Homepage
  2. jbig2dec homepage
  3. jbig2 plugin for Java's Image I / O
  4. BSI Technical Guideline 03138 Replacement Scanning. Abbreviation: BSI TR 03138 RESISCAN, Version 1.1. Federal Office for Information Security, March 2, 2017, p. 23 , accessed on May 1, 2017 .
  5. ^ JBIG2 compression studies
  6. BSI revises RESISCAN directive, prohibits JBIG2
  7. Xerox scan copiers change written numbers
  8. a b c David Kriesel : Don't trust a scan that you haven't faked yourself. youtube.com .
  9. Obviously other Xerox devices also affected .
  10. Information sheet on the Xerox scan copier incident .