Mixed raster content

From Wikipedia, the free encyclopedia

Mixed Raster Content (MRC) is a technique for representing the page content of a raster-based electronic document (scanned pages or synthetically generated page images) for the purpose of image compression. The page content is split by segmentation into several components (e.g. text and line drawings, color images), each of which is encoded in a raster format suited to it. The encoded regions are then placed on the page in separate layers. The layers can differ in size and resolution and, under certain circumstances, may overlap. A bitonal (two-color) layer is often used as a mask for a color layer, so that only those areas of the color layer selected by the mask are visible.
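
The layer split can be illustrated with a minimal sketch. It assumes a tiny RGB page held as nested lists and uses a fixed luminance threshold as a stand-in for real segmentation, which is usually far more elaborate. The names `luminance` and `segment` and the threshold value are hypothetical and are not taken from any standard.

```python
# Minimal sketch of an MRC-style split, assuming a naive luminance threshold.
# Real segmenters are considerably more sophisticated.

def luminance(pixel):
    """Approximate brightness of an (r, g, b) pixel using Rec. 601 weights."""
    r, g, b = pixel
    return 0.299 * r + 0.587 * g + 0.114 * b

def segment(page, threshold=128):
    """Split a page (rows of RGB tuples) into mask, foreground and background.

    mask[y][x] is 1 where the pixel is treated as text or line art, else 0.
    The foreground keeps the color of masked pixels, the background keeps the
    rest; positions a layer does not own are filled with a neutral value.
    """
    mask, fg, bg = [], [], []
    for row in page:
        mask_row = [1 if luminance(p) < threshold else 0 for p in row]
        mask.append(mask_row)
        fg.append([p if m else (0, 0, 0) for p, m in zip(row, mask_row)])
        bg.append([(255, 255, 255) if m else p for p, m in zip(row, mask_row)])
    return mask, fg, bg

WHITE, BLACK, BLUE = (255, 255, 255), (0, 0, 0), (0, 0, 160)
page = [
    [WHITE, BLACK, WHITE],
    [WHITE, BLUE, WHITE],
]
mask, fg, bg = segment(page)
print(mask)
```

In this toy page the dark pixels in the middle column end up in the mask and keep their color in the foreground, while the background becomes uniformly white and therefore compresses very well.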

Advantages

If raster-based document pages, such as those produced by scanning, are stored without segmentation into layers, they generally require far more data than with MRC. A color DIN A4 page at 300 dpi requires approximately 25 MB of storage without image compression (about 2480 × 3508 pixels at 3 bytes per pixel). Lossless compression methods such as LZW or PackBits achieve only low compression ratios on such pages (on the order of 2:1). Lossy methods such as JPEG or the lossy mode of JPEG 2000 can achieve high compression ratios (on the order of 80:1), but often produce visible artifacts in two-color text areas, since they are designed mainly for photorealistic images. Conversely, techniques optimized for two-color text cannot be applied to color images.
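
The uncompressed size quoted above can be checked with a short calculation. The sketch assumes 24-bit RGB (3 bytes per pixel) and rounds the pixel dimensions to whole pixels; the function name `raw_page_bytes` is illustrative.

```python
# Uncompressed size of a scanned page; assumes 24-bit RGB (3 bytes per pixel).
MM_PER_INCH = 25.4

def raw_page_bytes(width_mm, height_mm, dpi, bytes_per_pixel=3):
    """Return (width_px, height_px, size_in_bytes) for an uncompressed page."""
    width_px = round(width_mm / MM_PER_INCH * dpi)
    height_px = round(height_mm / MM_PER_INCH * dpi)
    return width_px, height_px, width_px * height_px * bytes_per_pixel

# DIN A4 is 210 mm x 297 mm.
w, h, size = raw_page_bytes(210, 297, 300)
print(w, h, size)                  # 2480 3508 26099520
print(f"{size / 2**20:.1f} MiB")   # 24.9 MiB
```

The result, roughly 26.1 million bytes or 24.9 MiB, is in line with the approximate figure given in the text.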

The advantage of MRC lies in encoding each type of region with a method tailored to its characteristics. This allows compression ratios on the order of 200:1 with very good legibility and good visual quality. Depending on the quality of the segmentation and the raster formats used, ratios of up to 400:1 are not uncommon.
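
To put the ratios mentioned in this section into perspective, the following sketch converts them into approximate file sizes. It assumes an uncompressed page of 26,099,520 bytes (A4 at 300 dpi, 24-bit RGB); the ratios are the order-of-magnitude figures from the text, not measurements.

```python
# Approximate file sizes for the order-of-magnitude ratios named in the text.
RAW_BYTES = 26_099_520  # assumed: A4 at 300 dpi, 24-bit RGB, uncompressed

ratios = {
    "lossless (LZW, PackBits)": 2,
    "lossy (JPEG, JPEG 2000)": 80,
    "MRC, typical": 200,
    "MRC, favorable case": 400,
}

def compressed_kib(raw_bytes, ratio):
    """Size in KiB after compressing raw_bytes at the given ratio (ratio:1)."""
    return raw_bytes / ratio / 1024

sizes = {name: compressed_kib(RAW_BYTES, r) for name, r in ratios.items()}
for name, kib in sizes.items():
    print(f"{name:26s} {kib:8.0f} KiB")
```

Under these assumptions a page shrinks from about 25 MB to roughly 127 KiB at 200:1 and about 64 KiB at 400:1.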

Encodings used

The following methods are used to encode the individual layers:

  • Color content, images: JPEG or JPEG 2000
  • Black-and-white text, line drawings, or masks: Fax Group 4, JBIG1, or JBIG2
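
The assignment of codecs to layers can be sketched as a simple lookup. The codec names come from the list above; the `Layer` class, the `candidate_codecs` function, and the rule of deciding by bit depth alone are illustrative assumptions, not part of any standard.

```python
from dataclasses import dataclass

# Codec names are taken from the list above; the mapping itself is illustrative.
CODECS = {
    "bitonal": ("Fax Group 4", "JBIG1", "JBIG2"),
    "color": ("JPEG", "JPEG 2000"),
}

@dataclass
class Layer:
    name: str
    bits_per_pixel: int  # 1 for masks and black-and-white text, 24 for RGB

def candidate_codecs(layer):
    """Return the codec family suited to a layer, judged by bit depth only."""
    kind = "bitonal" if layer.bits_per_pixel == 1 else "color"
    return CODECS[kind]

layers = [Layer("mask", 1), Layer("foreground", 24), Layer("background", 24)]
choices = {layer.name: candidate_codecs(layer) for layer in layers}
for name, codecs in choices.items():
    print(f"{name}: {', '.join(codecs)}")
```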

File formats

In the narrower sense, MRC refers to the international standard ITU-T Recommendation T.44 (ISO/IEC 16485). The layered page representation described there is also used in the JPM format (ISO/IEC 15444-6, JPEG 2000 Part 6). The technique can also be implemented in the Portable Document Format (PDF or PDF/A).

Example

Example of MRC segmentation

One possible MRC page representation uses a bitonal layer as a mask that selects between a color foreground image and a color background image. Typically, the foreground and background images are stored at reduced resolution, while the mask retains full resolution so that the legibility of the text is not impaired.

References

  1. ITU-T Recommendation T.44.
  2. ISO/IEC 15444-6.