Computer-aided steganography

Computer-aided steganography refers to processes that use steganographic techniques to hide data in carrier data accessible to a computer. The aim is to ensure confidentiality. This includes concepts such as plausible deniability.

Choosing a carrier requires that the data contain noise, i.e. naturally occurring variations. Some data tolerate more noise than others. Examples of carrier data that are modified steganographically with the aid of a computer are image files, audio files, and the cluster fragmentation of a file system, each of which is covered below.

Image and audio data have a comparatively high tolerable noise component. A careful change to this noise is not noticeable and thus allows a subliminal communication channel to be established. Since cryptographic methods can transform any data into a form whose bit distribution resembles white noise (whitening), the naturally occurring white noise is usually replaced with encrypted ciphertext.
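The whitening effect can be illustrated with a minimal sketch. It assumes a toy keystream built from SHA-256 in counter mode; the names `keystream`, `xor_bytes` and `ones_fraction` are illustrative, and the construction is not a vetted cipher. A real system would rely on an established encryption library.

```python
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 over key + counter (illustrative, not a vetted cipher)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])

def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR data with the keystream; applying it twice restores the input."""
    return bytes(d ^ k for d, k in zip(data, keystream(key, len(data))))

def ones_fraction(data: bytes) -> float:
    """Fraction of bits that are set to 1."""
    return sum(bin(b).count("1") for b in data) / (8 * len(data))

plaintext = b"GEHEIMNIS" * 500
ciphertext = xor_bytes(plaintext, b"ALICE")
print(ones_fraction(plaintext))   # visibly biased: ASCII letters are not uniform
print(ones_fraction(ciphertext))  # close to 0.5, like white noise
```

The plaintext has a clearly skewed share of one-bits, while the whitened output is close to balanced, which is what makes it a plausible substitute for natural noise.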

Modification of an image file

Image of a tree in which an additional (invisible) image of a cat is inserted using computer-aided steganographic methods

Image of a cat that was hidden in the two least significant bits of each pixel in the image above

Steganography on image data is relatively simple, since the human eye is considerably less sensitive to image noise than, for example, the ear is to audio noise. A photo can be altered substantially before the change is perceived as a disturbance.

Some image formats use indexed colors, i.e. they are palette-based (e.g. GIF, PCX). Since the same color can appear more than once in such a palette, it is easy to build an image from differently indexed colors that nevertheless looks monochrome to the viewer.
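The palette trick can be sketched as follows. The four-entry palette and the function names are hypothetical and chosen only for illustration: each visible color appears twice in the palette, and the hidden bit decides which of the two identical entries a pixel references.

```python
# Hypothetical palette: entries 0/1 are both white, entries 2/3 are both black.
PALETTE = [(255, 255, 255), (255, 255, 255), (0, 0, 0), (0, 0, 0)]

def embed(color_ids, bits):
    """color_ids holds the visible color per pixel (0 = white, 1 = black).
    The hidden bit selects one of the two identical palette entries."""
    return [2 * color + bit for color, bit in zip(color_ids, bits)]

def extract(indices):
    """The hidden bit is the lowest bit of the stored palette index."""
    return [index & 1 for index in indices]

def render(indices):
    """Resolve palette indices to the RGB colors a viewer would see."""
    return [PALETTE[index] for index in indices]

cover = [0, 0, 1, 0, 1, 1, 0, 0]
bits = [1, 0, 1, 1, 0, 0, 1, 0]
stego = embed(cover, bits)

# The rendered image is identical with and without the hidden bits.
print(render(stego) == render(embed(cover, [0] * len(cover))))
print(extract(stego) == bits)
```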

Steganographic methods that can be applied to digital carrier images include:

Overwriting the least significant bits with the signal to be hidden (LSB method).
Adding reproducible pseudo-random sequences of small amplitude that have first been modulated with the information to be hidden (cf. CDMA technology).
Quantizing the pixels of the carrier image, for example rounding the color values according to the bit value to be embedded (QIM, quantization index modulation).
Hiding the information in the frequency domain after a frequency transformation.
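Of the methods listed, quantization index modulation is perhaps the least intuitive. The following minimal sketch assumes a scalar quantizer with a step size of 4 on 8-bit color values; the step size and the function names are illustrative choices, not taken from the source. A value is rounded to a multiple of the step whose index is even for a 0 bit and odd for a 1 bit.

```python
def qim_embed(value: int, bit: int, step: int = 4) -> int:
    """Round value to a multiple of step whose quantizer index has the parity of bit."""
    index = round(value / step)
    if index % 2 != bit:
        # Move to the neighbouring index that lies closer to the original value.
        index += 1 if value / step >= index else -1
    if index * step > 255:
        # Stay inside the 8-bit range while keeping the parity.
        index -= 2
    return index * step

def qim_extract(value: int, step: int = 4) -> int:
    """Recover the bit from the parity of the nearest quantizer index."""
    return round(value / step) % 2

marked = qim_embed(130, 1)
print(marked, qim_extract(marked))
# Unlike plain LSB replacement, a disturbance smaller than step / 2 is tolerated.
print(qim_extract(marked + 1), qim_extract(marked - 1))
```

The design trade-off is visible in the step size: a larger step survives more noise but changes the carrier more.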

Simple methods in particular produce modified carriers whose steganographic contamination can easily be demonstrated by steganalysis, for example through statistical detection. Most of the methods are also not robust against subsequently added noise, compression, or rotation and scaling of the steganogram.

Illustration of steganography with a simple example

A secret message from Alice to Bob is to be transmitted steganographically in an image. Alice and Bob have already exchanged a secret password over a secure communication channel. Alice now selects an image as the carrier medium and embeds the secret message in it using the password.

Secret message: G E H E I M N I S
Password:       A L I C E

Both Alice and Bob know the algorithm of how the message is embedded in the picture:

The image is an RGB image with 8 bits per color channel, for a total of 24 bits per pixel. The pixels are first numbered consecutively, row by row, from top left to bottom right. Each pixel is additionally assigned a number from the periodically repeating series 1 to 26 (pixel 27 is assigned 1 again, pixel 28 is assigned 2, and so on). This series maps directly onto the 26 letters of the alphabet.

Only pixels whose series number corresponds to a letter of the password are used to embed the secret message, i.e. only the pixels with the numbers 1 (A), 3 (C), 5 (E), 9 (I) or 12 (L). This selection ignores duplicate letters and the order of the letters within the password.
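The pixel selection can be sketched in a few lines. The function name `selected_pixels` is illustrative; pixel numbers are 1-based as in the description above.

```python
import string
from itertools import count, islice

def selected_pixels(password: str):
    """Yield the 1-based numbers of all pixels whose position in the repeating
    series 1..26 matches the alphabet position of a password letter."""
    wanted = {string.ascii_uppercase.index(ch) + 1 for ch in password.upper()}
    for number in count(1):
        if (number - 1) % 26 + 1 in wanted:
            yield number

print(list(islice(selected_pixels("ALICE"), 9)))
# [1, 3, 5, 9, 12, 27, 29, 31, 35]
```

Using a set for the password letters reflects the rule that duplicates and letter order are ignored.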

The secret message is embedded by replacing the two least significant bits of each color channel of the selected pixels with the alphabet position of the corresponding message letter. For example, the first letter (G) is embedded in the first selected pixel (pixel 1). G is the seventh letter of the alphabet. 7 (decimal) corresponds to 111 (binary) or, since the three color channels provide a total of 3 × 2 bits, 000111 when padded to six bits. The following illustration shows the principle (X stands for an original bit of the pixel).

Before embedding: XXXX XXXX   XXXX XXXX   XXXX XXXX
After embedding:  XXXX XX00   XXXX XX01   XXXX XX11
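The replacement of the low bits for a single pixel can be sketched as follows. The function name `embed_letter` and the sample pixel value are illustrative assumptions.

```python
def embed_letter(pixel, letter):
    """Replace the two least significant bits of each channel with the
    6-bit alphabet position of the letter (A = 1 ... Z = 26)."""
    code = ord(letter.upper()) - ord("A") + 1   # G -> 7 -> 000111
    red, green, blue = pixel
    return (
        (red & 0b11111100) | ((code >> 4) & 0b11),    # top two bits of the code
        (green & 0b11111100) | ((code >> 2) & 0b11),  # middle two bits
        (blue & 0b11111100) | (code & 0b11),          # lowest two bits
    )

print(embed_letter((200, 100, 50), "G"))  # (200, 101, 51)
```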

Alice performs this replacement for each letter of the secret message.

Pixel number	1	3	5	9	12	27 (26 + 1)	29 (26 + 3)	31 (26 + 5)	35 (26 + 9)
Message letter	G	E	H	E	I	M	N	I	S
Position in the alphabet	7	5	8	5	9	13	14	9	19
Position in binary (padded to 6 bits)	000111	000101	001000	000101	001001	001101	001110	001001	010011
Red value after embedding	XXXX XX00	XXXX XX00	XXXX XX00	XXXX XX00	XXXX XX00	XXXX XX00	XXXX XX00	XXXX XX00	XXXX XX01
Green value after embedding	XXXX XX01	XXXX XX01	XXXX XX10	XXXX XX01	XXXX XX10	XXXX XX11	XXXX XX11	XXXX XX10	XXXX XX00
Blue value after embedding	XXXX XX11	XXXX XX01	XXXX XX00	XXXX XX01	XXXX XX01	XXXX XX01	XXXX XX10	XXXX XX01	XXXX XX11

To read the secret message, Bob selects the message-carrying pixels in the same way as Alice. He takes the two least significant bits of each color value and joins them into 6-bit groups, each of which represents one letter of the secret message.
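The complete exchange between Alice and Bob can be sketched as a round trip. This is a minimal sketch under stated assumptions: the image is modelled as a flat list of (R, G, B) tuples in row order, list index 0 stands for pixel number 1, and Bob is assumed to know the message length in advance, which the description above leaves open. All function names are illustrative.

```python
import random

A = ord("A")

def pixel_indices(password, count):
    """0-based indices of the first `count` message-carrying pixels."""
    wanted = {ord(ch) - A for ch in password.upper()}
    found, index = [], 0
    while len(found) < count:
        if index % 26 in wanted:
            found.append(index)
        index += 1
    return found

def embed(pixels, message, password):
    """Alice: write each letter into the low bits of one selected pixel."""
    pixels = list(pixels)
    for i, ch in zip(pixel_indices(password, len(message)), message.upper()):
        code = ord(ch) - A + 1
        r, g, b = pixels[i]
        pixels[i] = ((r & ~3) | ((code >> 4) & 3),
                     (g & ~3) | ((code >> 2) & 3),
                     (b & ~3) | (code & 3))
    return pixels

def extract(pixels, length, password):
    """Bob: rebuild each 6-bit group from the low bits of the selected pixels."""
    letters = []
    for i in pixel_indices(password, length):
        r, g, b = pixels[i]
        code = ((r & 3) << 4) | ((g & 3) << 2) | (b & 3)
        letters.append(chr(code - 1 + A))
    return "".join(letters)

rng = random.Random(0)
cover = [tuple(rng.randrange(256) for _ in range(3)) for _ in range(64)]
stego = embed(cover, "GEHEIMNIS", "ALICE")
print(extract(stego, 9, "ALICE"))  # GEHEIMNIS
```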

Evaluation of the approach

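Replacing the two least significant bits changes a color channel of a modified pixel by at most 3 of 256 levels (about 1.17%). If the original and the embedded bit pairs are both assumed to be roughly uniformly distributed, the average change is 1.25 levels (about 0.49%). Since only selected pixels are changed, it is unlikely that such a change can be perceived by humans.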
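The size of the change can be checked by enumerating every combination of original and embedded two-bit values. The sketch assumes that all 16 combinations are equally likely, which is an idealization.

```python
from itertools import product

# All combinations of the old and new value of the two least significant bits.
diffs = [abs(old - new) for old, new in product(range(4), repeat=2)]

max_change = max(diffs)                 # largest possible change in levels
mean_change = sum(diffs) / len(diffs)   # average change under the uniform assumption

print(max_change, f"{max_change / 256:.2%}")    # 3 1.17%
print(mean_change, f"{mean_change / 256:.2%}")  # 1.25 0.49%
```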

The disadvantage of this steganographic approach is that the least significant bits of the changed pixels follow a distribution that differs markedly from the natural one, in which ones and zeros occur about equally often. Because alphabet positions never exceed 26, the most significant bit of the 6-bit letter representation is always zero and the second most significant bit is almost always zero, as the example shows. With sufficiently long messages and sufficiently small images, this difference becomes significant and can be detected by statistical means.
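The bias can be made concrete for the example message by counting how often each of the six bit positions is set. The variable names are illustrative.

```python
message = "GEHEIMNIS"

# 6-bit alphabet positions, e.g. G -> "000111".
codes = [format(ord(ch) - ord("A") + 1, "06b") for ch in message]

# Share of one-bits per bit position (index 0 is the most significant bit).
ones_per_position = [
    sum(code[i] == "1" for code in codes) / len(codes) for i in range(6)
]

for position, share in enumerate(ones_per_position):
    print(position, round(share, 2))
```

Unbiased noise would give a share near 0.5 in every position. Here position 0 is never set and position 1 is set only for the letter S, which is the kind of regularity a statistical test picks up.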

Furthermore, the approach embeds in pixels at fixed positions, so cropping the image at the top, left or right edge shifts the pixel numbering and makes reading impossible. The approach does not survive lossy compression such as JPEG either, since the message sits in the least significant bits, whose values are usually changed by the quantization step during compression.

In summary, the approach very probably stays below the threshold of human perception, but it cannot be expected to withstand even trivial steganalysis methods. The approach is also not robust.

Modification of an audio file

Because audio data are ubiquitous, they attract little suspicion and are particularly suitable for conveying messages. A message to be hidden is encoded into the audio signal, for example using spreading codes (e.g. DSSS), so that the modification is hidden in the background noise. The signal-to-noise ratio is preserved.

A message hidden in this way can be recovered from the audio file by correlation, provided the specific spreading code is known. If the spreading code is not known, it is usually possible neither to establish that something has been embedded nor to obtain the hidden information. Without knowledge of the spreading code, the additional signal cannot be distinguished from the random background noise that is usually present and tolerated by the ear.
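Embedding with a spreading code and recovery by correlation can be sketched as follows. This is a simplified illustration, not a production DSSS scheme: the carrier is modelled as Gaussian background noise, the spreading code is derived from a shared integer seed, and the chip count and amplitude are arbitrary assumptions. Real audio would interfere far more strongly with the correlation than this idealized carrier.

```python
import random

def spreading_code(seed, length):
    """Reproducible pseudo-random sequence of +1/-1 chips from a shared seed."""
    rng = random.Random(seed)
    return [rng.choice((-1, 1)) for _ in range(length)]

def embed(samples, bits, seed, chips=2000, amplitude=2.0):
    """Add the low-amplitude code, sign-flipped per message bit, to the carrier."""
    code = spreading_code(seed, chips)
    out = list(samples)
    for k, bit in enumerate(bits):
        sign = 1 if bit else -1
        for j, chip in enumerate(code):
            out[k * chips + j] += sign * amplitude * chip
    return out

def detect(samples, n_bits, seed, chips=2000):
    """Correlate each block with the code; the sign of the sum gives the bit."""
    code = spreading_code(seed, chips)
    bits = []
    for k in range(n_bits):
        block = samples[k * chips:(k + 1) * chips]
        corr = sum(s * c for s, c in zip(block, code))
        bits.append(1 if corr > 0 else 0)
    return bits

noise = random.Random(1)
cover = [noise.gauss(0, 10) for _ in range(16 * 2000)]  # stand-in for background noise
message = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 0, 0]
stego = embed(cover, message, seed=2024)
print(detect(stego, len(message), seed=2024) == message)
```

Each sample changes by only 2 units against a noise level of about 10, yet summing over 2000 chips lifts the hidden bit well above the noise for a receiver who knows the seed.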

Another example of audio steganography is simple LSB replacement. It operates in the time domain on audio data in the form of individual sample values, which are typically available in PCM format. The least significant bits (LSB) of the samples are replaced by the bits of a secret message. This method is not robust against interference: even small changes to the format of the data, for example a change in the sampling rate, lead to the loss of the hidden message. The method does not survive lossy compression such as MP3 or Ogg Vorbis.
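LSB replacement on PCM samples can be sketched with the standard `array` module. The sketch assumes signed 16-bit samples and uses randomly generated values in place of a real recording; the function names are illustrative. The last lines mimic a crude halving of the sampling rate by dropping every second sample.

```python
import array
import random

def embed_lsb(samples, message: bytes):
    """Overwrite the least significant bit of successive samples with message bits."""
    bits = [(byte >> (7 - i)) & 1 for byte in message for i in range(8)]
    if len(bits) > len(samples):
        raise ValueError("carrier too short for message")
    out = array.array("h", samples)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & ~1) | bit
    return out

def extract_lsb(samples, n_bytes: int) -> bytes:
    """Collect the least significant bits and pack them into bytes (MSB first)."""
    bits = [sample & 1 for sample in samples[:8 * n_bytes]]
    return bytes(
        sum(bit << (7 - i) for i, bit in enumerate(bits[k:k + 8]))
        for k in range(0, len(bits), 8)
    )

rng = random.Random(0)
cover = array.array("h", (rng.randint(-2000, 2000) for _ in range(200)))
stego = embed_lsb(cover, b"GEHEIMNIS")

print(extract_lsb(stego, 9))        # b'GEHEIMNIS'
print(extract_lsb(stego[::2], 9))   # garbage after dropping every second sample
```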

Modification of the cluster fragmentation of a file system

It is possible to hide data in the fragmentation of files in a file system such as FAT32. This exploits the fact that a file is divided into clusters and, when saved to the data carrier, is usually split up and stored in different, non-consecutive (logical) locations if no sufficiently large contiguous area is available. The cluster size and number are manipulated in such a way that logical states result, which are mapped to binary ones and zeros.

Steganographic methods of this type have the disadvantage of not being very robust. The fragmentation is lost through defragmentation or even through copying. In addition, comparatively strong fragmentation is unlikely on rarely used data carriers and is therefore an inherent problem for plausible deniability.
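The idea and its fragility can be illustrated with a toy model. The encoding rule below is purely an assumption made for illustration and is not the scheme from the cited work: a file is modelled as a chain of cluster numbers, an adjacent successor cluster stands for 0 and a jump stands for 1. No real file system is touched.

```python
def encode(bits, start=100, gap=5):
    """Build a hypothetical cluster chain: adjacent next cluster = 0, jump = 1."""
    chain = [start]
    for bit in bits:
        chain.append(chain[-1] + (gap if bit else 1))
    return chain

def decode(chain):
    """Read the bits back from the distances between successive clusters."""
    return [0 if nxt - cur == 1 else 1 for cur, nxt in zip(chain, chain[1:])]

def defragment(chain):
    """Model defragmentation: the same number of clusters, stored contiguously."""
    return list(range(chain[0], chain[0] + len(chain)))

bits = [1, 0, 1, 1, 0, 0, 1, 0]
chain = encode(bits)
print(chain)
print(decode(chain) == bits)
print(decode(defragment(chain)))  # all zeros: the hidden bits are gone
```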

Web links

Steganography

Erwin Schwendike: Steganography - Examples with image and audio files (Windows)
Comparison of 18 steganography programs on Heise Security

Steganography software

Steghide (open source, English)
Outguess - program for hidden embedding of information in image files
TechWhoop (free steganography software for Windows, English)
VSL: Virtual Steganographic Laboratory (image files, open source, English)

References

[1] Fabien Petitcolas, Stefan Katzenbeisser: Information Hiding Techniques for Steganography and Digital Watermarking. Artech House, Boston, Mass. 2000, ISBN 978-1-58053-035-4

[2] Steganography through targeted hard disk fragmentation. heise security, April 26, 2011; retrieved April 26, 2011