Digital image forensics

Digital image forensics is a subdiscipline of digital multimedia forensics dedicated to investigating the authenticity of digital images, for example to obtain evidence in criminal investigations (cf. forensics). Most image forensic procedures that are relevant in practice are “blind”: they require no access to any original image that may exist, but derive their clues solely from an analysis of the image data itself.

Aims

Methods of digital image forensics serve to determine the origin of an image and to detect manipulations of digital image data. The starting point is usually a digital image under examination, whose properties are analyzed with statistical methods in a largely (semi-)automated process. Most methods assume no knowledge of a possibly existing (analog or digital) original image.

To examine the authenticity of an image, the available image data are checked for various features:

  • Presence of characteristics of the image input device
  • Absence of traces of image processing operations

The article on multimedia forensics contains a fundamental discussion of the origin and use of such features.

Device characteristics exploited by digital image forensics, using the digital camera as an example (schematic structure).

Determination of the image origin

An important aspect when examining the authenticity of digital image data is the question of an image's origin. Digital image forensics techniques attempt to establish a connection between an existing image file and the image input device used (scanner, digital camera, etc.).

In general, methods for determining the image origin rely on characteristics of the image input device. If features specific to a device can be detected in an image, this serves as an indication of the image's origin. Depending on the application and the nature of the features, the origin can be determined at different levels of detail. Typical scenarios are the distinction between computer-generated and natural images, the determination of the device class, the identification of the device model, and finally the recognition of the specific device. In principle, all methods for determining the origin of an image assume that images of the same origin have very similar properties and can thus be statistically distinguished from images of other origins.

Computer-generated images vs. natural images

In principle, the first question is whether a given image was generated entirely by a computer or whether it depicts a section of reality recorded with a sensor. In contrast to natural images, computer-generated images arise entirely from the imagination of their author and may therefore be assessed differently with regard to their content. To make the distinction, the assumption is made that the process for creating computer-generated images cannot completely reproduce the complex recording process inside an image input device (or does not pursue this as a primary goal). Typical approaches derive their indications from an analysis of the noise properties of the image or the relationships between neighboring pixels, as in the sketch below.
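As a purely illustrative sketch (not one of the published feature sets), the following Python snippet computes simple noise residual statistics per color channel; a classifier trained on such features could then attempt to separate rendered from sensor-recorded images. The function name and the choice of statistics are assumptions made here for illustration.

```python
import numpy as np
from scipy import ndimage
from scipy.stats import kurtosis

def noise_features(image):
    """Simple noise-based features per color channel: statistics of the
    high-pass residual (image minus a smoothed version). Illustrative
    simplification, not a faithful reimplementation of published work."""
    feats = []
    for c in range(image.shape[2]):
        channel = image[:, :, c].astype(np.float64)
        # Residual = image minus a denoised version, i.e. the "noise" part.
        residual = channel - ndimage.gaussian_filter(channel, sigma=1.0)
        feats += [residual.std(), kurtosis(residual.ravel())]
    return np.array(feats)
```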

The problem of differentiating between computer-generated and natural images is discussed particularly in the USA in connection with the prosecution of child pornography: images with child pornographic content that are completely computer-generated are treated differently there under criminal law.

Device class

The assignment of an image to a class of input devices aims to differentiate between basic recording concepts. A possible use case is the distinction between images from digital cameras and flatbed scanners, which exploits elementary differences in the structure of the devices. Typical features are based on the use of different sensor architectures (area sensor in cameras vs. line sensor in flatbed scanners). Images recorded with a line sensor generally have different noise characteristics than those from an area sensor. In addition, flatbed scanners usually require no color interpolation, whereas the interpolation in digital cameras leads to characteristic dependencies between neighboring pixels.

Device model

Individual device classes can be further subdivided into different models. Since devices of the same model are built from structurally identical components, images recorded with a specific model can be assigned to it on the basis of similar properties. Besides the device model itself, merely identifying the device manufacturer may already be relevant.

Clues for determining the device model can be derived from almost all components of an image input device, for example:

  • Different lens systems lead to imaging aberrations of differing strength.
  • Owing to the use of different sensors, the noise characteristics vary systematically between images from different models.
  • The layout of the color filter array (e.g. Bayer pattern) and the color interpolation algorithm used lead to model-specific dependencies between neighboring pixels.
  • Further device-internal steps for processing the color image (e.g. white balance) result in systematic similarities between the individual color channels.
  • The use of many different quantization tables leads to differences in the output JPEG files (see the sketch after this list).
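For the quantization table clue, a minimal sketch in Python using Pillow, which exposes a JPEG file's quantization tables; the file names are placeholders, and comparing raw tables like this is only a weak model indicator:

```python
from PIL import Image

def jpeg_quantization_tables(path):
    """Return the quantization tables stored in a JPEG file.
    Pillow exposes them as a dict mapping table id to 64 coefficients."""
    with Image.open(path) as im:
        if im.format != "JPEG":
            raise ValueError("not a JPEG file")
        return {k: list(v) for k, v in im.quantization.items()}

# Matching tables hint at the same model/software pipeline; differing
# tables rule it out. "image_a.jpg"/"image_b.jpg" are placeholders.
print(jpeg_quantization_tables("image_a.jpg") ==
      jpeg_quantization_tables("image_b.jpg"))
```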

The determination of the device model is usually framed as a classification problem in pattern recognition, in which each model corresponds to one class. Since the classes are generally assumed not to be linearly separable, support vector machines are often the method of choice. The dimensionality of the feature space is often very high, so methods for reducing the feature vectors are increasingly applied.
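A minimal sketch of such a classification pipeline with scikit-learn; the feature vectors and camera-model labels are random placeholders standing in for real per-image features (e.g. demosaicing or noise statistics):

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

# Hypothetical data: one high-dimensional feature vector per image
# and one camera-model label each (both placeholders).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 500))    # 300 images, 500 features
y = rng.integers(0, 4, size=300)   # 4 camera models

# PCA reduces the feature vectors; an RBF-kernel SVM separates the
# (generally not linearly separable) model classes.
clf = make_pipeline(StandardScaler(), PCA(n_components=50), SVC(kernel="rbf"))
print(cross_val_score(clf, X, y, cv=5).mean())
```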

Specific device

The aim of identifying a specific device is to distinguish between structurally identical image input devices (i.e. devices of the same model). One of the first specialist publications in the field of image forensics was devoted to this very problem and suggested using defective sensor elements as identification features. A similar approach is based on dust particles deposited on the sensor of digital SLR cameras, which lead to camera-specific artifacts in the image. Studies show that these remain a suitable identification feature even despite automatic sensor cleaning.

According to current knowledge, the most reliable and best-researched method for determining the image origin is based on the noise of the CCD/CMOS sensors in typical image input devices and was developed in the group around Jessica Fridrich. The method rests on the assumption that each sensor element reacts slightly differently to incoming light, which leads to a systematic noise component in the recorded image. This component is comparatively stable over several recordings of one device, but varies between images of different origins. The characteristic noise component (the so-called photo-response non-uniformity, PRNU) can be estimated from the image using a suitable noise filter. The device used for recording can then be determined with a correlation or maximum likelihood detector by comparing the estimated noise signal with known reference noise patterns. The characteristic sensor noise can also be detected in lossily compressed images and, under certain circumstances, even survives an analog conversion (e.g. printing) with subsequent re-digitization.
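A strongly simplified sketch of the correlation variant in Python: the published method uses a wavelet denoiser and a maximum likelihood estimator, while here a Gaussian filter stands in as the noise filter and a plain normalized correlation as the detector; the function names and threshold logic are illustrative assumptions.

```python
import numpy as np
from scipy import ndimage

def noise_residual(image, sigma=2.0):
    """Approximate the noise component of a grayscale image as the
    difference to a denoised version. The published method uses a wavelet
    denoiser; a Gaussian filter stands in here to keep the sketch short."""
    img = image.astype(np.float64)
    return img - ndimage.gaussian_filter(img, sigma)

def prnu_reference(images):
    """Estimate a camera's PRNU reference pattern by averaging the noise
    residuals of several flat, well-lit images from that camera."""
    return np.mean([noise_residual(im) for im in images], axis=0)

def correlation(residual, reference):
    """Normalized cross-correlation between a probe residual and a
    reference pattern; values clearly above zero suggest a match."""
    a = residual - residual.mean()
    b = reference - reference.mean()
    return float((a * b).sum() / np.sqrt((a * a).sum() * (b * b).sum()))

# In practice, the decision threshold is calibrated on images of known
# origin; probe and reference images here are placeholders.
```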

Performance and recognition rates

In principle, the lowest possible false acceptance rate is desirable when determining the image origin: in practical applications, the risk of an image being incorrectly assigned to a device that was not involved in its creation should be minimized.

By far the most reliable origin assignment of digital images is achieved with the sensor noise. In a large-scale test with more than a million images from over 6,800 different digital cameras (150 models in total), almost 98% of all images could be assigned to their correct origin at a false acceptance rate of 2.4 × 10⁻⁵.

The extent to which such high detection rates can also be achieved under practical conditions for determining the device class and model must largely be regarded as an open research question. Results reported in the literature are mostly based on comparatively small data sets that do not allow generalization. In general, however, the detection rates for class or model determination have so far remained below those of the sensor noise technique for identifying a specific device.

Detection of image manipulation

In addition to determining the origin of an image, the detection of manipulations of digital image data is the second central objective of digital image forensics. Both the inconsistent occurrence (or absence) of device characteristics and the presence of image processing artifacts can be exploited here.

Consistency of device characteristics

The extent of imaging aberrations (here chromatic aberration in the form of reddish color fringes) depends on the position in the image. If image areas are copied and pasted, this can lead to inconsistencies.

Every image input device leaves characteristic traces in the images it records. Assuming that typical image processing operations (e.g. in Photoshop, GIMP, etc.) disturb these characteristics, an image can be examined for the (consistent) presence of suitable device characteristics. If the expected features cannot be (consistently) demonstrated in the image, this can be interpreted as an indication of manipulation.

A large number of approaches exist for detecting image manipulations on the basis of missing or inconsistent device characteristics; their suitability depends on the situation:

  • The extent of imaging aberrations generally depends on the position in the image and grows with increasing distance from the optical center. If an image section is copied within the image, or pasted from another image, while ignoring this characteristic, verifiable inconsistencies result.
  • Since each sensor element has its own noise characteristic, the sensor noise also depends on the image position. If the device with which an image was recorded is known, the local consistency of the sensor noise can be checked. The device-specific noise is missing in image areas that have been (too heavily) processed or that come from a completely different image.
  • Assuming that an image was recorded with a single-shot color sensor, the (consistent) presence of color interpolation traces can be checked. Interpolated images have characteristic dependencies between neighboring pixels that are weakened or removed by post-processing.
  • The blocking artifacts caused by JPEG compression can also provide useful information about image manipulation. If a section from an uncompressed image is inserted into a JPEG image (or vice versa), the expected 8 × 8 block structure is missing, which can be demonstrated statistically even at very high JPEG quality. Conspicuous traces also arise if the existing block structure is ignored when inserting JPEG-compressed image sections into an already compressed image, i.e. the block boundaries are shifted (see the sketch after this list). Even if the block structure is preserved, inserting sections that were compressed with a different quantization table can leave detectable traces.
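A minimal sketch of a grid-offset check in Python (an illustrative simplification, not one of the published detectors): the phase at which column and row discontinuities cluster reveals the 8 × 8 grid alignment, and applying the function to local windows can expose regions whose grid is shifted against the rest of the image.

```python
import numpy as np

def block_grid_offset(gray):
    """Estimate the phase of the 8x8 JPEG block grid: pixel-value jumps
    cluster at block boundaries, so the column/row phase with the largest
    accumulated jumps marks the grid alignment. Simplified sketch."""
    img = gray.astype(np.float64)
    col_jumps = np.abs(np.diff(img, axis=1)).sum(axis=0)  # per column pair
    row_jumps = np.abs(np.diff(img, axis=0)).sum(axis=1)  # per row pair
    col_phase = [col_jumps[p::8].sum() for p in range(8)]
    row_phase = [row_jumps[p::8].sum() for p in range(8)]
    return int(np.argmax(col_phase)), int(np.argmax(row_phase))

# Applied to local windows, a window whose estimated phase deviates from
# the globally estimated one hints at an inserted, shifted JPEG region.
```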

Artifacts from image editing

Interpolation artifacts, using the example of enlarging a 2 × 2 pixel block by a factor of 2 (bilinear interpolation). Each interpolated pixel is a linear combination of its immediate neighbors. Since an enlarged image consists of many such blocks, the geometric transformation can be demonstrated statistically.

In addition to missing or inconsistent device characteristics, traces of the image processing operations themselves can be used for forensic analyses. In this case, the presence of certain features serves as an indication of possible manipulation.

Detecting image manipulations on the basis of processing artifacts has the advantage over methods based on device characteristics that no assumptions about the recording device are required. Typical detectable artifacts arise, for example, from:

  • Geometric transformations to adjust the size and shape of images or parts thereof, and the associated interpolation. If an image is geometrically transformed, missing information at the resulting gaps in the image grid must be calculated by interpolating from the existing pixels of the original image. This leads to spatial periodicity in the dependencies between neighboring pixels, which can be demonstrated with statistical methods (see the first sketch after this list).
  • Copy-and-paste operations for retouching image sections. If a section has been copied within the image (e.g. with a clone stamp tool), this can be determined by searching for duplicated image regions. To detect slight deviations between the copied areas, it is not the pixel values themselves but a transformed representation (DCT, PCA, ...) that is compared (see the second sketch after this list).
  • Recompression when a lossily compressed image is saved repeatedly. If a JPEG image is saved again in JPEG format (e.g. after processing), this can leave detectable traces in the DCT coefficients, particularly if the second compression uses a different quantization table.
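First, a minimal sketch of a resampling check in Python, loosely following the idea of analyzing the residue of a fixed linear predictor; the specific predictor and the visual inspection of the spectrum are simplifying assumptions:

```python
import numpy as np

def resampling_spectrum(gray):
    """Residue of a fixed linear predictor: interpolation makes pixels
    (nearly) linear combinations of their neighbors, so the prediction
    error becomes periodic and produces distinct off-center peaks in the
    spectrum. Sketch only; real detectors add windowing and peak tests."""
    img = gray.astype(np.float64)
    # Predict each pixel from its 4-neighborhood and keep the error.
    pred = (np.roll(img, 1, axis=0) + np.roll(img, -1, axis=0) +
            np.roll(img, 1, axis=1) + np.roll(img, -1, axis=1)) / 4.0
    residue = img - pred
    return np.abs(np.fft.fftshift(np.fft.fft2(residue)))
```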
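Second, a naive copy-move search in Python that compares blocks via quantized low-frequency DCT coefficients; block size, step, and descriptor quantization are arbitrary illustrative choices, and real detectors additionally filter matches in flat regions and test the spatial offsets:

```python
import numpy as np
from scipy.fftpack import dct

def copy_move_candidates(gray, block=8, keep=10):
    """Naive copy-move search: slide a window over the image, describe each
    block by its quantized low-frequency DCT coefficients (robust to slight
    changes), sort the descriptors, and report identical neighbors."""
    img = gray.astype(np.float64)
    h, w = img.shape
    entries = []
    for y in range(0, h - block + 1, 2):        # step 2 keeps the demo fast
        for x in range(0, w - block + 1, 2):
            b = img[y:y + block, x:x + block]
            coeffs = dct(dct(b, axis=0, norm="ortho"), axis=1, norm="ortho")
            key = tuple(np.round(coeffs[:3, :3].ravel() / 4.0).astype(int))
            entries.append((key, (y, x)))
    entries.sort(key=lambda e: e[0])
    # Adjacent entries with identical descriptors are duplicate candidates.
    return [(a[1], b[1]) for a, b in
            zip(entries, entries[1:]) if a[0] == b[0]][:keep]
```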

When is a fake a fake?

Image forensic methods cannot answer the semantic question of whether an image is a fake. At most, they can objectively determine whether an image has been processed in some way. To recognize forgeries as such, it must first be clear what counts as a forgery, and in practice this delimitation is often not trivial. A central question is whether (and which) content-changing operations are permissible. In general, it can be assumed that almost every published image has been post-processed in some way (color correction, ...). For this reason, many newspapers and magazines have formulated their own image editing guidelines for their picture editors.

Limits

Although image forensic methods represent a promising approach to checking the authenticity of digital images, some obstacles remain to be overcome before they can be widely used in practice.

For the majority of all methods, the analysis of lossily compressed image data is a major challenge (apart from the methods that are based directly on compression artifacts). Subtle device characteristics or traces of manipulation are often blurred by strong compression. This is all the more problematic since JPEG is probably the most widely used file format for storing digital images.

A general problem is the comparatively high testing effort involved in evaluating image forensic procedures. Because typical image data are highly complex and hard to model, their reliability can only be estimated empirically. Creating comprehensive and representative test data sets is very time-consuming, however, so the error rates currently reported in the literature are often of limited significance.

In addition, the probative value of evidence obtained with image forensic procedures is currently difficult to assess in court, since reports on legal practice exist only in anecdotal form, if at all. For now, it can be assumed that image forensic analyses enter the assessment of evidence in the form of expert reports and are therefore relatively time-consuming and expensive.

Literature

  • Oliver Deussen: Image manipulation: How computers distort our reality. Spektrum Akademischer Verlag, Berlin / Heidelberg 2007, ISBN 978-3-8274-1900-2, chapter 7 (Technically recognizing image manipulation: digital forensics).
  • Hany Farid: Image forgery detection. In: IEEE Signal Processing Magazine. Vol. 26, No. 2, March 2009, ISSN 1053-5888, pp. 16-25, doi:10.1109/MSP.2008.931079.
  • Judith A. Redi, Wiem Taktak, Jean-Luc Dugelay: Digital image forensics: a booklet for beginners. In: Multimedia Tools and Applications. Vol. 51, No. 1, January 2011, ISSN 1380-7501, pp. 133-162, doi:10.1007/s11042-010-0620-1.
  • Andrea Trinkwalder: Digital Image Forensics: Algorithms Hunt Forgers. In: c't - magazine for computer technology. August 18, 2008, ISSN 0724-8679, pp. 152-156.
