Image pyramid

from Wikipedia, the free encyclopedia
Visual representation of an image pyramid with five levels.

An image pyramid is a type of multi-scale signal representation, developed in the fields of computer vision, image processing, and signal processing, in which a signal or image is repeatedly smoothed and downsampled. The pyramid representation is a precursor to the scale-space representation and to multi-scale analysis.

Creation of the pyramid

There are two main types of image pyramid: low-pass and band-pass.

A low-pass pyramid is created by smoothing the image with a suitable smoothing filter and then downsampling the smoothed image, usually by a factor of two along each coordinate axis. The same procedure is then applied to the resulting image, and this cycle is repeated several times. Each cycle produces a smaller image that is smoother but has a lower sampling density (and hence a lower image resolution). Drawn graphically, the complete multi-scale representation looks like a pyramid, with the original image at the base and the progressively smaller images from each cycle stacked on top of it.
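The smooth-then-downsample cycle described above can be sketched as follows. This is a minimal illustration rather than a reference implementation: the function names `smooth_axis` and `lowpass_pyramid`, the 3-tap kernel (1/4, 1/2, 1/4), and the edge-replicating border handling are all assumptions made for the example.

```python
import numpy as np

def smooth_axis(img, kernel, axis):
    """Convolve along one axis, replicating edge pixels at the borders."""
    pad = len(kernel) // 2
    pad_width = [(0, 0)] * img.ndim
    pad_width[axis] = (pad, pad)
    padded = np.pad(img, pad_width, mode="edge")
    return np.apply_along_axis(
        lambda v: np.convolve(v, kernel, mode="valid"), axis, padded
    )

def lowpass_pyramid(image, levels, kernel=(0.25, 0.5, 0.25)):
    """Return [level 0, level 1, ...]; each level is smoothed, then halved."""
    kernel = np.asarray(kernel, dtype=float)
    pyramid = [np.asarray(image, dtype=float)]
    for _ in range(levels - 1):
        smoothed = smooth_axis(smooth_axis(pyramid[-1], kernel, 0), kernel, 1)
        pyramid.append(smoothed[::2, ::2])  # keep every second row and column
    return pyramid

# Hypothetical test image; any 2-D array would do.
image = np.random.default_rng(0).random((64, 64))
pyramid = lowpass_pyramid(image, levels=5)
shapes = [level.shape for level in pyramid]
```

With a 64 x 64 input and five levels, each level has half the side length of the one below it, which is the stacked shape that gives the representation its name.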

A band-pass pyramid is generated by forming the difference between images at adjacent levels of the pyramid, using some form of image interpolation between the two resolutions so that the differences can be computed pixel by pixel.
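A one-dimensional sketch of a single band-pass level may make the role of interpolation clearer. The sample signal, the 3-tap smoothing kernel, and the choice of linear interpolation via `np.interp` are assumptions for illustration; other interpolation schemes would serve equally well.

```python
import numpy as np

signal = np.array([0.0, 1.0, 4.0, 9.0, 16.0, 25.0, 36.0, 49.0])
kernel = np.array([0.25, 0.5, 0.25])

# Low-pass level: smooth with edge replication, then keep every second sample.
smoothed = np.convolve(np.pad(signal, 1, mode="edge"), kernel, mode="valid")
coarse = smoothed[::2]

# Interpolate the coarse level back onto the fine sample positions.
fine_x = np.arange(signal.size)
coarse_x = fine_x[::2]
predicted = np.interp(fine_x, coarse_x, coarse)

# Band-pass level: the part of the fine signal the coarse level cannot explain.
band = signal - predicted
```

The interpolation step is what puts both levels on the same sampling grid; without it, the two arrays would have different lengths and could not be subtracted sample by sample.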

Convolution kernels for creating pyramids

A variety of smoothing kernels have been proposed for generating pyramids. Among these, the binomial kernels, which arise from the binomial coefficients, stand out as a particularly useful and theoretically well-founded class. For a two-dimensional image, the normalized binomial filter (1/4, 1/2, 1/4) is typically applied twice or more along each spatial dimension, and the image is then downsampled by a factor of two. This operation is repeated as many times as desired, yielding a compact and efficient multi-scale representation. Where specific requirements call for it, intermediate scale levels can also be generated, sometimes omitting the downsampling step, which results in an oversampled or hybrid pyramid. With the increasing computational power of today's processors, it is in some situations also feasible to use Gaussian filters with wider support as smoothing kernels when generating the pyramid levels.
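As a small worked example, the snippet below shows what applying the binomial filter twice amounts to, and how the same one-dimensional filter yields a separable two-dimensional kernel. The variable names and the `variance` helper are illustrative only.

```python
import numpy as np

b = np.array([0.25, 0.5, 0.25])      # normalized binomial filter

# Applying the filter twice equals one pass with the self-convolved kernel,
# whose taps are the binomial coefficients (1, 4, 6, 4, 1) divided by 16.
b_twice = np.convolve(b, b)

# Filtering along each spatial dimension in turn is equivalent to a single
# 2-D convolution with the outer product of the 1-D kernel with itself.
kernel_2d = np.outer(b, b)

def variance(kernel):
    """Variance of a kernel treated as a distribution centered on zero."""
    offsets = np.arange(kernel.size) - kernel.size // 2
    return float(np.sum(offsets**2 * kernel))

var_once, var_twice = variance(b), variance(b_twice)
```

Each pass of (1/4, 1/2, 1/4) adds a variance of 1/2, so two passes give a kernel with variance 1. This additivity is one reason repeated binomial filtering behaves like smoothing with a progressively wider Gaussian.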

Gaussian pyramids

In a Gaussian pyramid, each successive image is smoothed with a Gaussian-weighted average (Gaussian blur) and then scaled down. Each pixel therefore holds a local average of a pixel neighborhood on the pyramid level below. This technique is used mainly in texture synthesis.

Laplacian pyramids

A Laplacian pyramid is very similar to a Gaussian pyramid, but stores the difference image between the smoothed versions at adjacent levels. Only the smallest level is not a difference image, so that the high-resolution image can be reconstructed from it using the difference images of the other levels. This method can be used in image compression.
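The following sketch builds such a pyramid and reconstructs the original image from it. To keep the example short it uses a 2 x 2 block mean as the low-pass step and pixel repetition as the interpolation step; both are simplifying assumptions, not the Gaussian filtering a production implementation would normally use. The helper names are hypothetical, and the image side lengths are assumed to be divisible by 2^(levels - 1).

```python
import numpy as np

def reduce_level(img):
    """Assumed low-pass step: 2x2 block mean, halving each axis."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def expand_level(img):
    """Assumed interpolation: repeat each pixel into a 2x2 block."""
    return np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)

def build_laplacian(image, levels):
    """Return [diff_0, ..., diff_{levels-2}, coarsest low-pass image]."""
    pyramid, current = [], np.asarray(image, dtype=float)
    for _ in range(levels - 1):
        smaller = reduce_level(current)
        pyramid.append(current - expand_level(smaller))
        current = smaller
    pyramid.append(current)  # the only level that is not a difference image
    return pyramid

def reconstruct(pyramid):
    """Start from the coarsest image and add the differences back in."""
    current = pyramid[-1]
    for diff in reversed(pyramid[:-1]):
        current = expand_level(current) + diff
    return current

image = np.random.default_rng(1).random((32, 32))
pyramid = build_laplacian(image, levels=4)
restored = reconstruct(pyramid)
```

Because each difference image is defined relative to the expanded coarser level, the reconstruction is exact up to floating-point rounding regardless of which low-pass and interpolation steps are chosen.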

Steerable pyramid

A steerable pyramid is an implementation of a multi-scale, multi-orientation band-pass filter bank used for applications such as image compression, texture synthesis, and object recognition. It can be thought of as an orientation-selective version of the Laplacian pyramid in which a bank of steerable filters is used at each level of the pyramid instead of a single Laplacian or Gaussian filter.
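The idea of a steerable filter can be illustrated with the simplest textbook case, the first derivative of a Gaussian: a filter at any orientation is a linear combination of two fixed basis filters. The sketch below shows only this steering property, not a full steerable pyramid; the function names, kernel size, and sigma are assumptions chosen for the example.

```python
import numpy as np

SIGMA = 1.5  # assumed scale of the Gaussian

def gaussian_derivative_basis(size=9, sigma=SIGMA):
    """Return the x- and y-derivative-of-Gaussian basis kernels."""
    r = size // 2
    y, x = np.mgrid[-r:r + 1, -r:r + 1].astype(float)
    g = np.exp(-(x**2 + y**2) / (2 * sigma**2))
    return -x / sigma**2 * g, -y / sigma**2 * g

def steer(gx, gy, theta):
    """Synthesize the derivative filter oriented at angle theta (radians)."""
    return np.cos(theta) * gx + np.sin(theta) * gy

theta = np.deg2rad(30.0)
gx, gy = gaussian_derivative_basis()
steered = steer(gx, gy, theta)

# Direct construction for comparison: differentiate along the rotated axis.
r = gx.shape[0] // 2
y, x = np.mgrid[-r:r + 1, -r:r + 1].astype(float)
u = x * np.cos(theta) + y * np.sin(theta)
direct = -u / SIGMA**2 * np.exp(-(x**2 + y**2) / (2 * SIGMA**2))
```

The steered kernel matches the directly constructed one, so responses at arbitrary orientations can be obtained from the responses of the two basis filters alone, without designing a new filter for each angle.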

Applications of image pyramids

Alternative representations

In the early days of computer vision, image pyramids were the predominant way of computing multi-scale representations from real images. Scale-space representation is one of the more recent techniques. Its popularity among researchers rests on its theoretical foundation, the ability to decouple the downsampling stage from the multi-scale representation, better tools for theoretical analysis, and the ability to compute a representation at any desired scale, which avoids the algorithmic problems of relating image representations at different resolutions. Nevertheless, image pyramids are still frequently used to compute efficient approximations of the scale-space representation.

Detail manipulation

Laplacian image pyramids based on the bilateral filter provide a good framework for enhancing and manipulating image detail. The difference images between adjacent levels are modified to amplify or attenuate detail at different scales.
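The modify-then-reconstruct idea can be sketched on a one-dimensional signal. Note that this example substitutes a plain pairwise mean for the bilateral filter mentioned above, purely to stay short; the function names and the per-level `gains` parameter are hypothetical, and the signal length is assumed to be divisible by 2 raised to the number of gains.

```python
import numpy as np

def reduce1d(s):
    """Assumed low-pass step: mean of each pair of samples."""
    return s.reshape(-1, 2).mean(axis=1)

def expand1d(s):
    """Assumed interpolation: repeat each sample twice."""
    return np.repeat(s, 2)

def manipulate_detail(signal, gains):
    """Scale difference level k by gains[k] (finest first), then rebuild."""
    current = np.asarray(signal, dtype=float)
    diffs = []
    for _ in gains:
        smaller = reduce1d(current)
        diffs.append(current - expand1d(smaller))
        current = smaller
    for diff, gain in zip(reversed(diffs), reversed(gains)):
        current = expand1d(current) + gain * diff
    return current

signal = np.array([2.0, 4.0, 3.0, 7.0, 1.0, 1.0, 6.0, 8.0])
unchanged = manipulate_detail(signal, gains=[1.0, 1.0, 1.0])
flattened = manipulate_detail(signal, gains=[0.0, 0.0, 0.0])
sharpened = manipulate_detail(signal, gains=[2.0, 1.0, 1.0])
```

A gain of 1 at every level returns the input, a gain of 0 removes the detail at that scale, and a gain above 1 exaggerates it. An edge-preserving filter such as the bilateral filter is typically preferred in practice because it avoids halos near strong edges.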

Some compressed image file formats use the Adam7 algorithm or other interlacing techniques. These can be regarded as a kind of image pyramid. Because such formats store the coarse, large-scale features of the image first and the finer details later in the file, a viewer can quickly display a small preview after downloading only the beginning of the file. A single file can thus support multiple viewing resolutions, instead of a separate image being stored or generated for each resolution.
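To make the pyramid analogy concrete, the sketch below lists which pixels each of the seven Adam7 passes transmits, using the starting offsets and strides of the interlacing scheme used by PNG. The function name `adam7_pass_pixels` is an assumption for this example, and actual decoding of image data is out of scope.

```python
# (x_start, y_start, x_step, y_step) for the seven Adam7 passes.
ADAM7_PASSES = [
    (0, 0, 8, 8),
    (4, 0, 8, 8),
    (0, 4, 4, 8),
    (2, 0, 4, 4),
    (0, 2, 2, 4),
    (1, 0, 2, 2),
    (0, 1, 1, 2),
]

def adam7_pass_pixels(width, height):
    """Return, for each pass, the list of (x, y) pixels it transmits."""
    return [
        [(x, y) for y in range(y0, height, dy) for x in range(x0, width, dx)]
        for (x0, y0, dx, dy) in ADAM7_PASSES
    ]

passes = adam7_pass_pixels(8, 8)
counts = [len(p) for p in passes]
```

For an 8 x 8 tile the passes carry 1, 1, 2, 4, 8, 16, and 32 pixels respectively, so every pass after the first doubles the number of pixels received so far. The early passes therefore form a coarse, subsampled version of the image, much like the upper levels of a pyramid, although no smoothing is applied before subsampling.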
