JPEG

An image with quality levels decreasing from left to right

JPEG ([ˈdʒeɪpɛɡ]) is the common name for the standard ISO/IEC 10918-1 (also published as CCITT Recommendation T.81), presented in 1992, which describes various methods of image compression. The name goes back to the Joint Photographic Experts Group, the committee that developed the standard.

JPEG specifies various compression and coding methods, including lossy and lossless compression, different color depths, and a sequential or progressive mode (normal image build-up or stepwise refinement). Only lossy compression in sequential or progressive mode with 8-bit color channels is in widespread use.

The JPEG standard only describes image compression processes, but does not specify how the resulting data should be saved. Commonly, “JPEG files” or “JPG files” are files in the graphic format JPEG File Interchange Format (JFIF). However, JFIF is just one way of storing JPEG data; SPIFF and JNG are other, albeit rarely used, options.

JPEG / JFIF supports a maximum image size of 65,535 × 65,535 pixels or 65,535 pixels on the longest side of the image.

Overview and standards

The JPEG standard ISO/IEC 10918-1 defines the following modes, of which in practice essentially only the sequential and progressive modes with Huffman coding and 8-bit precision are used:

  • Sequential: Huffman coding (8 or 12 bit) or arithmetic coding (8 or 12 bit)
  • Progressive: Huffman coding (8 or 12 bit) or arithmetic coding (8 or 12 bit)
  • Lossless
  • Hierarchical

In addition to the lossy modes defined in ISO/IEC 10918-1, there is also the improved lossless compression method JPEG-LS, which is specified in a separate standard. For the compression of black-and-white images there is furthermore the JBIG standard.

JPEG and JPEG-LS are defined in the following standards:

JPEG (lossy and lossless): ITU-T T.81, ISO/IEC 10918-1
JPEG (extensions): ITU-T T.84
JPEG-LS (lossless, improved): ITU-T T.87, ISO/IEC 14495-1

The JPEG standard bears the official title Information technology – Digital compression and coding of continuous-tone still images: Requirements and guidelines. The “Joint” in the name refers to the collaboration of ITU, IEC and ISO.

The JPEG compression

The JPEG standard defines 41 different sub-formats, of which usually only one is supported (and which covers almost all use cases).

The compression is done by applying multiple processing steps, four of which are lossy.

The data reduction is achieved by the lossy processing steps together with the entropy coding.

Compression down to about 1.5–2 bits per pixel is visually lossless; at 0.7–1 bit per pixel good results can still be achieved; below about 0.3 bits per pixel JPEG becomes practically unusable, and the image is increasingly dominated by unmistakable compression artifacts (blocking, stair-stepped transitions, color banding on gray gradients). The successor JPEG 2000 is much less susceptible to this type of artifact.

With 24-bit RGB files as the source format, this corresponds to compression ratios of about 12 to 15 for visually lossless images (24 bits per pixel divided by roughly 1.5–2 bits per pixel) and up to about 35 for images that are still of good quality. Besides the compression ratio, the quality also depends on the kind of image: noise and fine regular structures in the image reduce the maximum achievable compression ratio.

The JPEG Lossless Mode for lossless compression uses a different method (predictive coder and entropy coding).

Color model conversion

Original color image (top) and its decomposition into the components Y, Cb and Cr. The low perceived contrast of the color components Cb and Cr makes it clear why their resolution can be reduced (subsampling) without significantly impairing the image.

The source image, which is usually available as an RGB image, is converted into the YCbCr color model. Essentially, the YPbPr scheme according to CCIR 601 is used:

Y' =  0.299 · R' + 0.587 · G' + 0.114 · B'
Pb = −0.168736 · R' − 0.331264 · G' + 0.5 · B'
Pr =  0.5 · R' − 0.418688 · G' − 0.081312 · B'

Since the R'G'B' values are already available digitally as 8-bit numbers in the range {0, 1, …, 255}, the YPbPr components only need to be shifted (renormalized), which yields the components Y' (luminance), Cb (blue chrominance) and Cr (red chrominance):

Y  = Y'
Cb = Pb + 128
Cr = Pr + 128

The components are now in the value range {0, 1,…, 255}.

When converting the color model, the usual rounding errors occur due to limited arithmetic precision. In addition, a further data reduction takes place, since the Cb and Cr values are only calculated for every second pixel (see CCIR 601).
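
The conversion can be illustrated with a short sketch in Python with NumPy (the coefficients are the usual JFIF/CCIR 601 values; the function name is chosen here for illustration and is not taken from any particular library):

  import numpy as np

  def rgb_to_ycbcr(rgb):
      """Convert an H x W x 3 array of 8-bit R'G'B' values to Y'CbCr (JFIF convention)."""
      rgb = rgb.astype(np.float64)
      r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
      y  =  0.299 * r + 0.587 * g + 0.114 * b
      cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
      cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
      # Round and clamp back to the 8-bit range {0, ..., 255}.
      return np.clip(np.rint(np.stack([y, cb, cr], axis=-1)), 0, 255).astype(np.uint8)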

Low pass filtering of the color difference signals

The color difference signals Cb and Cr are usually stored at reduced resolution. To do this, they are low-pass filtered and subsampled (in the simplest case by averaging).

Usually, vertical and horizontal subsampling by a factor of 2 is applied (YCbCr 4:2:0), which reduces the data volume of the two chroma components by a factor of 4 (and the total data volume by half). This conversion exploits the fact that the spatial resolution of the human eye is significantly lower for color than for brightness transitions.
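
In the simplest case, the 4:2:0 subsampling is a 2 × 2 averaging of each chroma plane, as in this Python/NumPy sketch (an illustration only; real encoders may apply a more careful low-pass filter first):

  import numpy as np

  def subsample_420(chroma):
      """Average 2 x 2 pixel groups of a chroma plane (height and width assumed to be even)."""
      h, w = chroma.shape
      groups = chroma.astype(np.float64).reshape(h // 2, 2, w // 2, 2)
      return np.rint(groups.mean(axis=(1, 3))).astype(np.uint8)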

Block formation and discrete cosine transformation

Each component (Y, Cb and Cr) of the image is divided into 8 × 8 blocks. These are subjected to a two-dimensional discrete cosine transform (DCT):

S(u,v) = 1/4 · C(u) · C(v) · Σ(x=0…7) Σ(y=0…7) s(x,y) · cos[(2x+1)·u·π/16] · cos[(2y+1)·v·π/16]

with the normalization factors C(k) = 1/√2 for k = 0 and C(k) = 1 for k > 0.

Instead of 64 individual pixel values, each 8 × 8 block is represented as a linear combination of these 64 basis patterns.
The compressed 8 × 8 blocks can be seen in the enlargement.

This transformation can be implemented with very little effort using the fast Fourier transform (FFT). The DCT is an orthogonal transformation with good energy-compaction properties, and an inverse transformation, the IDCT, exists. The DCT step itself is therefore lossless apart from rounding errors; the data is merely brought into a form that is more suitable for further processing.
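
Written out directly from the formula above, the 8 × 8 forward DCT looks as follows in Python with NumPy (a deliberately simple, not speed-optimized sketch; the level shift of the sample values by −128 follows the standard's convention):

  import numpy as np

  def dct_8x8(block):
      """Two-dimensional DCT-II of an 8 x 8 block of 8-bit samples, as used by JPEG."""
      s = block.astype(np.float64) - 128.0          # level shift from {0..255} to {-128..127}
      n = np.arange(8)
      # cos_tab[u, x] = cos((2x + 1) * u * pi / 16)
      cos_tab = np.cos((2 * n[None, :] + 1) * n[:, None] * np.pi / 16)
      c = np.full(8, 1.0)
      c[0] = 1.0 / np.sqrt(2.0)                     # normalization C(0) = 1/sqrt(2), C(u>0) = 1
      return 0.25 * np.outer(c, c) * (cos_tab @ s @ cos_tab.T)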

Quantization

As with all lossy coding methods, the actual data reduction (and loss of quality) is achieved by quantization. To do this, the DCT coefficients are divided element by element by the quantization matrix and then rounded to the nearest integer:

B(u,v) = round( S(u,v) / Q(u,v) )

In this rounding step an irrelevance reduction takes place. The quantization matrix is responsible for both the quality and the compression rate. It is stored in the header of JPEG files (DQT marker).

The quantization matrix is optimal when it roughly reflects the sensitivity of the eye to the corresponding spatial frequencies. The eye is more sensitive to coarse structures, so the quantization values for these frequencies are smaller than those for high frequencies.

As an example, a quantization matrix is applied element by element to an 8 × 8 block of DCT coefficients: each coefficient S(u,v) is divided by the corresponding matrix entry Q(u,v) and the result is rounded to the nearest integer.
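
A sketch of this step in Python with NumPy; the matrix shown is the example luminance quantization table from Annex K of the standard (encoders are free to use other tables, and the function names are chosen here for illustration):

  import numpy as np

  # Example luminance quantization table from Annex K of ISO/IEC 10918-1.
  Q_LUMA = np.array([
      [16, 11, 10, 16,  24,  40,  51,  61],
      [12, 12, 14, 19,  26,  58,  60,  55],
      [14, 13, 16, 24,  40,  57,  69,  56],
      [14, 17, 22, 29,  51,  87,  80,  62],
      [18, 22, 37, 56,  68, 109, 103,  77],
      [24, 35, 55, 64,  81, 104, 113,  92],
      [49, 64, 78, 87, 103, 121, 120, 101],
      [72, 92, 95, 98, 112, 100, 103,  99],
  ])

  def quantize(S, Q=Q_LUMA):
      """Divide the DCT coefficients element by element and round to the nearest integer."""
      return np.rint(S / Q).astype(np.int32)

  def dequantize(B, Q=Q_LUMA):
      """Decoder-side approximation: multiply the quantized values back by the table."""
      return B * Q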

Re-sorting and differential coding of the constant component

Zigzag order of the image components

The 64 coefficients of the discrete cosine transform are sorted by frequency. This results in a zigzag sequence, starting with the DC coefficient at frequency 0 (abbreviated DC after the English term direct current), which here represents the average brightness of the block. Large coefficients usually come first and small coefficients later, which optimizes the input for the subsequent run-length coding. The reordering sequence looks like this:

 1  2  6  7 15 16 28 29
 3  5  8 14 17 27 30 43
 4  9 13 18 26 31 42 44
10 12 19 25 32 41 45 54
11 20 24 33 40 46 53 55
21 23 34 39 47 52 56 61
22 35 38 48 51 57 60 62
36 37 49 50 58 59 63 64

In addition, the DC coefficient is coded differentially with respect to the DC coefficient of the block to its left, which exploits the dependencies between neighboring blocks.
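
Both steps can be sketched in a few lines of Python with NumPy; the zigzag order is derived here so that it matches the table above, and the helper names are chosen purely for illustration:

  import numpy as np

  # (row, column) index pairs of an 8 x 8 block in zigzag order, matching the table above.
  ZIGZAG = sorted(((u, v) for u in range(8) for v in range(8)),
                  key=lambda p: (p[0] + p[1],
                                 p[0] if (p[0] + p[1]) % 2 else p[1]))

  def zigzag_scan(block):
      """Flatten a quantized 8 x 8 block into its 64-element zigzag sequence."""
      return np.array([block[u, v] for u, v in ZIGZAG])

  def dc_differences(dc_values):
      """Replace each DC coefficient by its difference to the previous block's DC value."""
      return np.diff(np.asarray(dc_values), prepend=0)

  # Example from the text: DC values 119, 78, 102, 75, 132 become 119, -41, 24, -27, 57.
  print(dc_differences([119, 78, 102, 75, 132]))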

The above example leads to the following reordered coefficients:

119  …
 78   3  -8  0 -4  7 -1  0 -1  0  0  0 -2  1  0  1  1 -1 0 …
102   5  -5  0  3 -4  2 -1  0  0  0  0  1  1 -1  0  0 -1 0 0 0 0 0 0 0 1 0 …
 75 -19   2 -1  0 -1  1 -1  0  0  0  0  0  0  1 …
132  -3  -1 -1 -1  0  0  0 -1  0 …

The difference coding of the first coefficient then gives:

-41   3  -8  0 -4  7 -1  0 -1  0  0  0 -2  1  0  1  1 -1 0 …
 24   5  -5  0  3 -4  2 -1  0  0  0  0  1  1 -1  0  0 -1 0 0 0 0 0 0 0 1 0 …
-27 -19   2 -1  0 -1  1 -1  0  0  0  0  0  0  1 …
 57  -3  -1 -1 -1  0  0  0 -1  0 …

In low-detail regions (of the same image), the coefficients may instead look like this:

 35 -2  0 0 0 1 0 …
  4  0  1 0 …
  0  0  2 0 1 0 …
-13  0 -1 …
  8  1  0 …
 -2  0 …

Such areas can of course be coded much better than detail-rich areas, for example by run-length coding.
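
The gain can be illustrated with a heavily simplified run-length scheme in Python (JPEG's actual AC coding uses combined run-length/magnitude symbols and a dedicated end-of-block code, so this is only a sketch of the idea):

  def run_length_encode(coefficients):
      """Encode a zigzag sequence as (number of preceding zeros, value) pairs.

      Trailing zeros are summarized by a single end-of-block symbol 'EOB'."""
      symbols, zero_run = [], 0
      for value in coefficients:
          if value == 0:
              zero_run += 1
          else:
              symbols.append((zero_run, value))
              zero_run = 0
      symbols.append("EOB")                       # covers all remaining zeros of the block
      return symbols

  # A low-detail block needs far fewer symbols than 64 raw values:
  print(run_length_encode([35, -2, 0, 0, 0, 1, 0, 0, 0, 0]))   # [(0, 35), (0, -2), (3, 1), 'EOB']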

The zigzag reordering of the DCT coefficients falls under the scope of protection of US patent 4,698,672 (and further applications and patents in Europe and Japan). However, in 2002 it was found that the claimed method was not new (prior art existed), so the claims would hardly have been enforceable. In the meantime, the patents of the family belonging to the aforementioned US patent have also expired, such as EP patent 0 266 049 B1 in September 2007.

Entropy coding

Huffman coding is usually used for the entropy coding. The JPEG standard also permits arithmetic coding. Although it produces files that are 5 to 15 percent smaller, it is rarely used, for patent reasons and because this coding is significantly slower.
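
A generic Huffman code can be constructed in a few lines of Python; note that JPEG stores its Huffman tables in a specific canonical form in the file header (DHT marker), which this illustrative sketch does not reproduce:

  import heapq
  from collections import Counter

  def huffman_code(symbols):
      """Build a prefix code: frequent symbols receive short bit strings, rare ones long ones."""
      heap = [[count, i, {sym: ""}]
              for i, (sym, count) in enumerate(Counter(symbols).items())]
      heapq.heapify(heap)
      while len(heap) > 1:
          lo, hi = heapq.heappop(heap), heapq.heappop(heap)
          codes = {s: "0" + c for s, c in lo[2].items()}
          codes.update({s: "1" + c for s, c in hi[2].items()})
          heapq.heappush(heap, [lo[0] + hi[0], lo[1], codes])
      return heap[0][2]

  print(huffman_code("aaaabbc"))    # {'c': '00', 'b': '01', 'a': '1'} -- 'a' gets the shortest code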

The JPEG decoding

The decompression (usually called decoding) takes place inversely to the compression:

  • Entropy decoding
  • Rearrangement
  • Dequantization
  • Inverse discrete cosine transform (IDCT)
  • Upsampling and low-pass filtering of the color difference signals Cb and Cr (lossy)
  • Color model conversion from the YCbCr color model to the target color space (usually RGB)

Decompression is (largely) lossless, but the inverse problem arises: it is practically impossible to reconstruct the original file from the decoded data. A decode/encode cycle therefore changes the file and is not loss-free; generation losses occur, as with analog copying.

However, the generation losses of JPEG are comparatively small if the same quantization table is used again and the block boundaries are identical. Under suitable conditions, generation loss can even be avoided entirely with JPEG. This is no longer possible with JPEG 2000 (avoiding it with overlapping transforms, as used in JPEG 2000 and in audio data compression, would require unrealistic computing power).

Inverse discrete cosine transform

For the DCT there exists the inverse transformation, the IDCT:

s(x,y) = 1/4 · Σ(u=0…7) Σ(v=0…7) C(u) · C(v) · S(u,v) · cos[(2x+1)·u·π/16] · cos[(2y+1)·v·π/16]

with the same normalization factors C(u) and C(v) as for the DCT.
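
In the same Python/NumPy notation as the forward transform sketched above, the inverse transform reads:

  import numpy as np

  def idct_8x8(S):
      """Inverse 8 x 8 DCT; returns sample values back in the 8-bit range {0..255}."""
      n = np.arange(8)
      cos_tab = np.cos((2 * n[None, :] + 1) * n[:, None] * np.pi / 16)   # cos_tab[u, x]
      c = np.full(8, 1.0)
      c[0] = 1.0 / np.sqrt(2.0)
      s = 0.25 * (cos_tab.T @ (np.outer(c, c) * S) @ cos_tab)
      return np.clip(np.rint(s + 128.0), 0, 255).astype(np.uint8)   # undo the level shift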

Color model conversion

The conversion back from the YCbCr color model into the RGB color space is performed with the inverse of the forward conversion matrix; with the usual JFIF coefficients:

R = Y + 1.402 · (Cr − 128)
G = Y − 0.344136 · (Cb − 128) − 0.714136 · (Cr − 128)
B = Y + 1.772 · (Cb − 128)
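
As a counterpart to the color conversion sketch further above, again in Python with NumPy and the usual JFIF coefficients:

  import numpy as np

  def ycbcr_to_rgb(ycc):
      """Convert an H x W x 3 Y'CbCr array back to 8-bit R'G'B' (JFIF convention)."""
      ycc = ycc.astype(np.float64)
      y, cb, cr = ycc[..., 0], ycc[..., 1], ycc[..., 2]
      r = y + 1.402 * (cr - 128.0)
      g = y - 0.344136 * (cb - 128.0) - 0.714136 * (cr - 128.0)
      b = y + 1.772 * (cb - 128.0)
      return np.clip(np.rint(np.stack([r, g, b], axis=-1)), 0, 255).astype(np.uint8)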

Progressive JPEG

A JPEG image consists of coefficients. These store not individual pixels but approximations of the entire image content of an 8 × 8 image block. With progressive JPEG, first the first coefficient of each image block is stored, then the second, and so on, so that the approximation of the original image becomes progressively better.
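
The principle of transmitting the coefficients in several passes can be sketched as follows in Python (a simplified spectral-selection scheme with bands chosen for illustration; real progressive JPEG additionally combines this with successive approximation of the coefficient bits):

  def progressive_passes(blocks, bands=((0, 0), (1, 5), (6, 63))):
      """Split zigzag-ordered 64-coefficient blocks into scans.

      Each scan carries one band of coefficient indices for every block, so the
      decoder can refine its approximation of the whole image scan by scan."""
      for lo, hi in bands:
          yield [block[lo:hi + 1] for block in blocks]

  # Example: two blocks, three scans (DC first, then low, then high frequencies).
  blocks = [list(range(64)), list(range(64, 128))]
  for scan in progressive_passes(blocks):
      print([len(band) for band in scan])    # [1, 1], then [5, 5], then [58, 58]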

As with the interlacing used in GIF, the purpose is to give the user a rough preview image quickly, before the entire file has been loaded. This is particularly useful when loading an image takes longer than about half a second to a second, or when only a preview image is needed. Large images, however, are usually offered for download in normal (sequential) JPEG mode.

Lossless post-processing of JPEG

Lossless visual post-processing

Losses when rotating and re-saving a JPEG image with a “crooked” resolution of 1021 × 767 pixels (not divisible by 16).
Repeated rotation of a JPEG image whose resolution is divisible by 16 (e.g. 1024 × 768), using the same quantization matrix, is by contrast lossless (if implemented correctly).

Although decoding and recoding is usually lossy, some image manipulations can (in principle) be carried out without undesired data loss:

  • Image rotations by 90 °, 180 ° and 270 °
  • horizontal and vertical image mirroring
  • Trimming of edges by multiples of 16 pixels (or 8 pixels for black-and-white images or color images without subsampling)

To do this, only the entropy coding and the zigzag reordering are reversed. The operations are then carried out on the DCT coefficients themselves (re-sorting, dropping blocks that are no longer needed), after which the zigzag reordering and the entropy coding are applied again; no lossy processing steps are involved. Not every program performs these operations losslessly: this requires special processing modules specific to the file format, which is usually not the case with the popular image editors, since they typically first decode the file into a bitmap and then work on that data.
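
Why such operations can be lossless is easiest to see for horizontal mirroring, which can be carried out entirely on the quantized DCT coefficients. The following Python/NumPy sketch handles a single component plane of complete blocks and ignores the additional bookkeeping for subsampled chroma planes and partial edge blocks:

  import numpy as np

  def mirror_horizontally(coeff_blocks):
      """Mirror an image stored as an array of quantized 8 x 8 DCT coefficient blocks.

      coeff_blocks has shape (block_rows, block_cols, 8, 8). Mirroring reverses the
      order of the block columns and negates every coefficient whose horizontal
      frequency index v is odd; both operations are exact on integers."""
      sign = np.where(np.arange(8) % 2 == 1, -1, 1)     # (-1)**v for the column index v
      return coeff_blocks[:, ::-1] * sign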

For example, the command-line program jpegtran, which is available for Windows and Linux, can perform all of these operations losslessly, as can the GUI-based IrfanView for Windows.

Images whose resolution is not a multiple of 16 pixels (or 8 pixels for black-and-white images or color images without subsampling) are problematic. They contain incomplete blocks, i.e. blocks that do not use all of the synthesized pixels. JPEG only permits such blocks at the right and lower edge of the image. Some of these operations therefore require that these edge strips be discarded once.

Lossless extended compression of the data

The company Dropbox has developed a program that uses arithmetic coding to reduce the space required by existing JPEG files by an average of around 20 percent. The program is called Lepton and is released under the Apache 2.0 license, an open-source license.

Visual quality and related formats

JPEG compression was developed for natural (raster) images, such as those found in photography or computer-generated images .

JPEG is unsuitable for

  • digital line drawings (e.g. screenshots or vector graphics) that contain many high-frequency image parts (hard edges),
  • black-and-white images with 1 bit per pixel,
  • halftone images (newspaper printing).

In addition, JPEG does not support transparent graphics.

Formats such as GIF, PNG or JBIG are far more suitable for these images.

Subsequently increasing the quality factor increases the storage requirement of the image file but does not bring back lost image information. The quantization tables can be chosen freely and are not standardized. Many image processing programs, however, let the user choose a general quality factor between 0 and 100, which is converted into a quantization table according to a formula from the widely used libjpeg reference library. Even with quality factors such as “100” or “100 %” there is still a quantization and thus a loss of quality, which can be considerable for images that are unsuitable for JPEG.
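
For orientation, the quality-factor mapping used by the widespread libjpeg implementation (a convention of that library, not of the standard) scales a base table as in the following Python/NumPy sketch; the base table could be, for example, the Annex K luminance table shown earlier:

  import numpy as np

  def scale_quant_table(base_table, quality):
      """Scale a base quantization table for a quality setting of 1..100 (libjpeg convention)."""
      quality = min(max(int(quality), 1), 100)
      scale = 5000 // quality if quality < 50 else 200 - 2 * quality
      table = (np.asarray(base_table, dtype=np.int64) * scale + 50) // 100
      # Even quality 100 (scale 0) yields an all-ones table, i.e. rounding losses remain.
      return np.clip(table, 1, 255)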

A JPEG transformation is generally not idempotent: opening and then saving a JPEG file results in a renewed lossy compression.

Images of different JPEG quality levels, from left to right: "90", "60", "20", detail ("20")

The example image compares pictures encoded with different quality settings. The portrait has a size of 200 × 200 pixels. At a color depth of 8 bits per color channel and uncompressed storage, this image yields a 120 kByte file (excluding header and other metadata). The block structure of the 8 × 8 pixel squares is shown enlarged in the right-hand part of the picture. Another problem besides blocking is “ringing”, a consequence of the poor behavior of the DCT at hard color transitions.

In the professional sector, JPEG is rarely used because of its lossy data reduction; instead, formats that compress losslessly are used, despite the much larger memory requirement, for example TIFF, BMP, TGA or PNG (full-color mode). An uncompressed 6-megapixel image with a color depth of 16 bits per primary color and 3 primary colors requires 36 MBytes of storage (6,000,000 pixels × 3 colors × 2 bytes), which lossless compression can reduce only moderately for structured, grainy or noisy images (for detailed photos, compression rates of around 50 % are common).

It is often possible to optimize the compression of existing JPEG files without further loss and thus reduce the file size somewhat. Newer versions of some archiving programs are able to compress JPEG images losslessly by up to a further 25 percent.

The moving-image compression methods MPEG-1 (closely related to the Motion JPEG codec) and MPEG-2 build on the JPEG standard. A successor project to JPEG for storing images is JPEG 2000, which compresses better and has many useful properties, but has so far not achieved broad acceptance. Another potential successor format is JPEG XR, which is based on the HD Photo format developed by Microsoft, but which has likewise only found sporadic support so far.

Another successor format, JPEG XL, is also being developed. It is said to offer a number of advantages, in particular improved compression based on the experimental PIK and FUIF formats.

Patent issues

Several companies (see patent trolls) have already tried to use their (mostly wrongly granted) patents to assert claims against software manufacturers whose products can read or create JPEG files. So far, all of the relevant patents have subsequently been withdrawn. Nevertheless, the claimants were able to collect millions in out-of-court settlements.

Implementations

A very important implementation of a JPEG codec is the free program library libjpeg. It was first published in 1991 and was a key factor in the standard's success. It, or a direct descendant of it, is used in a vast number of applications.

In March 2017, Google presented a new open-source JPEG encoder named Guetzli. It compresses image files better than previous methods: according to Google, Guetzli produces JPEG files that are up to 35 percent smaller than those of conventional encoders and causes fewer annoying compression artifacts. However, its coding speed (about 0.01 megapixels per second) is roughly four orders of magnitude lower than that of standard JPEG encoders (about 100 megapixels per second), so in practice it is only suitable for static images that are delivered very frequently.

Literature

  • Heiner Küsters: Image data compression with JPEG and MPEG. Franzis, Poing 1995, ISBN 3-7723-7281-3 .
  • Thomas W. Lipp: Graphic formats. Microsoft Press, Unterschleißheim 1997, ISBN 3-86063-391-0 .
  • John Miano: Compressed Image File Formats. Addison-Wesley, Reading 2000, ISBN 0-201-60443-4 (English).
  • William Pennebaker, Joan Mitchell: JPEG Still Image Data Compression Standard. Chapman & Hall, New York 1993, ISBN 0-442-01272-1 (English).
  • Tilo Strutz: Image data compression, basics, coding, wavelets, JPEG, MPEG. 4th, revised and supplemented edition, Vieweg + Teubner, Wiesbaden 2009, ISBN 978-3-8348-0472-3 .

