Predictive coding
Predictive coding is a form of coding that relies heavily on predictions and assumptions.
Using the (known) data that has already been read, the coder attempts to predict the data that follows. If, for example, an object is known to be moving in the picture, one can try to predict where it, or parts of it, will be in the next pictures. Only the difference between the prediction and the true picture is stored. In addition to video compression methods, methods for audio and image compression also use predictive coding.
The advantage is that if the prediction is good, the deviation from the real values is very small (close to zero). The values to be stored are therefore small and similar to one another, and can easily be reduced further by a subsequent compression step (e.g. run-length coding or entropy coding).
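The effect can be illustrated with a minimal sketch. It assumes a synthetic, smoothly varying 8-bit signal and the simplest possible predictor ("the next sample equals the previous one"); zlib stands in for the subsequent compression step. The signal and the function name are illustrative assumptions, not part of any particular codec.

```python
import math
import zlib

def to_residuals(samples):
    """Predict each sample as the previous one and keep the difference (mod 256)."""
    prev = 0
    out = bytearray()
    for s in samples:
        out.append((s - prev) % 256)
        prev = s
    return bytes(out)

# Assumed test signal: two superimposed sine waves, quantized to 8 bits.
signal = bytes(
    int(128 + 100 * math.sin(i / 50) + 20 * math.sin(i / 7)) for i in range(4000)
)
residuals = to_residuals(signal)

# The residuals carry the same information but cluster around zero,
# so the general-purpose compressor has an easier job.
raw_size = len(zlib.compress(signal, 9))
res_size = len(zlib.compress(residuals, 9))
print("compressed raw signal:", raw_size, "bytes")
print("compressed residuals: ", res_size, "bytes")
```

The prediction step itself does not shrink anything: `residuals` is exactly as long as `signal`. The gain only appears once the compressor is applied.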
Predictive coding is therefore not a data reduction (the amount of information remains the same), but a transformation of the values into a form that is a better starting point for other coding methods.
However, it has the disadvantage of increasing the decoding effort, since the decoder must make the same prediction and then correct it by the stored deviation in order to calculate the output value. Furthermore, the encoder and decoder depend on one another; it is not possible to change the type of prediction in the encoder without also adapting the decoder.
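The coupling between encoder and decoder can be sketched as follows. The predictor used here (linear extrapolation from the last two values) is only an assumed example; the point is that both sides must call the same function.

```python
def predict(history):
    """Assumed predictor: linear extrapolation from the last two known values."""
    if len(history) >= 2:
        return 2 * history[-1] - history[-2]
    if history:
        return history[-1]
    return 0

def encode(values, predictor=predict):
    """Store only the deviation of each value from its prediction."""
    return [v - predictor(values[:i]) for i, v in enumerate(values)]

def decode(residuals, predictor=predict):
    """Repeat the prediction and correct it by the stored deviation."""
    values = []
    for r in residuals:
        values.append(predictor(values) + r)
    return values

data = [10, 12, 14, 16, 19, 22, 25]
residuals = encode(data)
print(residuals)                    # [10, 2, 0, 0, 1, 0, 0]
print(decode(residuals) == data)    # True

# A decoder with a different predictor reconstructs wrong values.
def previous_value(history):
    return history[-1] if history else 0

print(decode(residuals, previous_value) == data)  # False
```

The decoder performs the same prediction work as the encoder, which is where the additional decoding effort comes from.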
Because each value is coded relative to previous values, predictive coding periodically requires absolute reference points if a video is to be streamed or if seeking (fast-forward and rewind) is to be possible. These reference points are so-called I-pictures, which are stored completely and from which the following pictures can be predicted.
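A minimal sketch of this idea, assuming that frames are short lists of numbers and that a complete I-picture is inserted every `gop` frames (the parameter name and the tuple layout of the stream are illustrative assumptions):

```python
def encode_stream(frames, gop=4):
    """Store every gop-th frame completely ("I"), all others as differences ("P")."""
    stream = []
    for i, frame in enumerate(frames):
        if i % gop == 0:
            stream.append(("I", list(frame)))
        else:
            prev = frames[i - 1]
            stream.append(("P", [c - p for c, p in zip(frame, prev)]))
    return stream

def decode_frame(stream, index):
    """Random access: go back to the nearest I-picture, then decode forward."""
    start = index
    while stream[start][0] != "I":
        start -= 1
    frame = list(stream[start][1])
    for _kind, diff in stream[start + 1 : index + 1]:
        frame = [p + d for p, d in zip(frame, diff)]
    return frame

frames = [[i, 2 * i, 100 - i] for i in range(10)]
stream = encode_stream(frames)
print(decode_frame(stream, 6))  # [6, 12, 94]
```

Without the I-pictures, decoding frame 6 would require decoding every frame from the very beginning of the stream; with them, at most `gop - 1` difference frames have to be applied.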
Types of prediction for image compression
Very simple prediction algorithms for images predict the current pixel as a (possibly weighted) mean of already known pixels. Since most raster image formats store the image data line by line from top to bottom, and within each line from left to right, the pixels known to the decoder lie to the left of and above the current pixel. The PNG file format defines five filter types (None, Sub, Up, Average and Paeth), four of which are actual predictors, the so-called pre-filters. Depending on the image content, a different filter gives a better result (i.e. a smaller one after compression). Since the filter can change for each image line, there is a very large number of possible filter combinations. An "optimizer" for PNG files can try out different combinations in order to find the smallest file size. Most programs that write PNG files, however, use a single pre-filter for the entire image, which is either fixed depending on the color depth or determined from the image content using a simple heuristic.
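The following sketch applies the PNG filter types to a single image line and picks one with the "minimum sum of absolute differences" heuristic suggested in the PNG specification. It assumes one byte per pixel (`bpp=1`); the function names are illustrative and do not correspond to any library API.

```python
def paeth(a, b, c):
    """Paeth predictor: pick whichever of left, above, upper-left is closest to a + b - c."""
    p = a + b - c
    pa, pb, pc = abs(p - a), abs(p - b), abs(p - c)
    if pa <= pb and pa <= pc:
        return a
    if pb <= pc:
        return b
    return c

def filter_row(kind, row, prev_row, bpp=1):
    """Replace each byte by its difference from the prediction (mod 256)."""
    out = bytearray()
    for i, x in enumerate(row):
        a = row[i - bpp] if i >= bpp else 0       # pixel to the left
        b = prev_row[i]                           # pixel above
        c = prev_row[i - bpp] if i >= bpp else 0  # pixel above and to the left
        prediction = {
            "None": 0,
            "Sub": a,
            "Up": b,
            "Average": (a + b) // 2,
            "Paeth": paeth(a, b, c),
        }[kind]
        out.append((x - prediction) % 256)
    return bytes(out)

def cost(filtered):
    """Sum of absolute values, treating each byte as a signed difference."""
    return sum(min(v, 256 - v) for v in filtered)

def best_filter(row, prev_row):
    kinds = ["None", "Sub", "Up", "Average", "Paeth"]
    return min(kinds, key=lambda k: cost(filter_row(k, row, prev_row)))

prev_row = bytes([10, 20, 30, 40])
row = bytes([12, 22, 32, 42])
print(list(filter_row("Sub", row, prev_row)))  # [12, 10, 10, 10]
print(list(filter_row("Up", row, prev_row)))   # [2, 2, 2, 2]
print(best_filter(row, prev_row))              # Up
```

In this example the line differs from the line above by a constant, so the Up filter produces a run of identical small values, which a subsequent compressor handles well.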
Types of prediction for video compression
- context-free prediction (motion prediction)
Here the prediction takes place exclusively by determining changes in brightness in the image, without semantic information. For example, it does not matter whether a car or a ball is filmed; only the change in brightness caused by the movement is important.
- model-based prediction (motion prediction)
Here the prediction is based on expectations about the displayed image content. A good example is video telephony, in which the picture usually changes only slightly, since the background remains the same and the person on the call moves little.
- object and region-based prediction (motion prediction)
The most complex type of prediction is segmentation / object recognition and tracking. Here individual objects are (semantically) recognized and the movement is predicted over several images (e.g. ball rolls past the camera).
- camera movement (motion compensation)
Often a picture changes largely because the camera moves in a straight line (for example, filming the passing roadside through the side window of a car), rotates or zooms. In all of these cases, a great deal of image content can be reused by keeping it and only shifting it using motion vectors.
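The camera movement case can be sketched with a single global motion vector. The example assumes frames stored as small lists of brightness values, an exhaustive search over a small window, and the sum of absolute differences (SAD) as the matching criterion. Like the context-free prediction above, it only looks at brightness values, and all names are illustrative assumptions.

```python
def shift(frame, dx, dy, fill=0):
    """Move the whole frame by (dx, dy); uncovered pixels get the fill value."""
    h, w = len(frame), len(frame[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sy, sx = y - dy, x - dx
            if 0 <= sy < h and 0 <= sx < w:
                out[y][x] = frame[sy][sx]
    return out

def sad(a, b):
    """Sum of absolute brightness differences between two frames."""
    return sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))

def estimate_global_motion(prev, cur, radius=2):
    """Try every shift within the search radius and keep the best match."""
    candidates = [
        (dx, dy)
        for dy in range(-radius, radius + 1)
        for dx in range(-radius, radius + 1)
    ]
    return min(candidates, key=lambda v: sad(shift(prev, v[0], v[1]), cur))

def residual(prev, cur, vector):
    """Difference between the current frame and the motion-compensated prediction."""
    pred = shift(prev, vector[0], vector[1])
    return [[c - p for c, p in zip(rc, rp)] for rc, rp in zip(cur, pred)]

# Assumed test frames: a textured 8x8 frame and the same content panned
# one pixel to the right, with new content (value 3) entering on the left.
prev = [[(7 * x + 13 * y) % 50 for x in range(8)] for y in range(8)]
cur = shift(prev, 1, 0)
for row in cur:
    row[0] = 3

vector = estimate_global_motion(prev, cur)
res = residual(prev, cur, vector)
print(vector)   # (1, 0)
print(res[0])   # [3, 0, 0, 0, 0, 0, 0, 0]
```

Only the motion vector and the residual need to be stored. The residual is zero wherever the shifted previous frame already matches, and non-zero only in the newly exposed column.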