Deinterlacing

from Wikipedia, the free encyclopedia

Deinterlacing denotes a process in which the fields of an interlaced video signal are converted into full frames.

This is necessary whenever the camera records interlaced material and the camera and the display have different temporal and vertical image structures. Apart from 100 Hz televisions, this applies to all non-tube televisions, i.e. liquid crystal and plasma screens, whether they are built as direct-view displays, rear-projection sets or front projectors.

Deinterlacing is also always necessary when television programs or video DVDs recorded with the interlaced method are to be viewed on any type of computer monitor (apart from the now outdated video monitors of old home computers). Only conventional 50 Hz tube televisions and 50 Hz tube projectors can do without deinterlacing. Deinterlacing can be performed either in the television set itself or in the set-top box (DVD player, DVB receiver, etc.) that delivers the signal. On a computer, it is performed either in software (e.g. DVD player software) or at the hardware level (e.g. TV card). The image quality depends crucially on the deinterlacer used.

Interlace signals

For historical and technical reasons, all 50 Hz and 60 Hz tube television sets use the interlacing method, in which fields rather than full images (frames) are displayed. Each field consists of only half of the picture lines of a full image. A field with the odd picture lines (odd or top field) and a field with the even picture lines (even or bottom field) are displayed alternately. The interlace method was introduced in the early days of television in order to achieve a reasonably flicker-free picture with the technology of the time. Nowadays, however, the method is a real problem because it is unsuitable for modern screens (LCD, plasma, DLP) and impairs the image quality. For reasons of compatibility, practically all standard-definition television and video signals still transmit fields rather than full images. With the PAL standard common in Germany, for example, the signal carries not 25 full images but 50 fields per second. Such a signal is called "interlaced" (interwoven).
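
The relationship between a full image and its two fields can be sketched in a few lines of Python. This is only an illustration under simple assumptions: a picture is modelled as a list of rows, and the function name `split_fields` is made up for this example.

```python
def split_fields(frame):
    """Split a full image (a list of rows) into its two fields."""
    top = frame[0::2]     # picture lines 1, 3, 5, ... (odd lines, counted from 1)
    bottom = frame[1::2]  # picture lines 2, 4, 6, ... (even lines)
    return top, bottom


# A tiny four-line "picture"; each row holds two brightness values.
frame = [[10, 10], [20, 20], [30, 30], [40, 40]]
top, bottom = split_fields(frame)
```

Each field holds exactly half of the picture lines, which is why 50 fields per second carry the same number of lines as 25 full images per second.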

In the case of interlaced signals, a distinction must be made between two types of sources: film recordings and video recordings.

In the production of films, film cameras are used that record full images (usually at 24 Hz). These recordings are primarily intended for the cinema, where full images are also shown. For TV transmission, such film recordings have to be broken down into fields in order to generate the necessary interlaced signal. Each pair of successive fields then derives from one and the same frame and therefore has the same time index. Such a signal is also referred to as progressive with segmented frames (psF).

The situation is completely different with video recordings that were produced with TV cameras for television. TV cameras work according to the interlace method and record fields, so they generate an interlaced signal directly. Since one field is recorded first and the other afterwards, two consecutive fields have different time indices. With PAL, there is a time offset of 0.02 seconds between two fields. (See also: Moving Images.)

When it comes to deinterlacing, there is an essential difference between cinema and TV material. TV material consists of 50 different individual images per second (PAL); with continuous movement, each of these fields shows a different "snapshot". In cinema material shown in PAL format (2:2 pull-down), each pair of successive fields comes from the same frame. This means that, on the one hand, cinema material can theoretically be deinterlaced perfectly (the full images can be derived unambiguously), but on the other hand the movements are less fluid because there are only half as many "snapshots" of the movement. Cinema material therefore requires a different type of filtering in order to appear harmonious after deinterlacing.

Methods

A number of different deinterlacing methods are used today. Some of these differ considerably in terms of the effort involved. In some cases, findings from artificial intelligence are even used. The most important procedures are described in more detail below.

Weave (field insertion)

The easiest way to deinterlace interlaced material is to display the two fields at the same time, that is, to superimpose them. The odd lines of one field and the even lines of the other field together form a full image. However, this only works without loss of quality for film material, whose fields come from the same recording time. In this case, the process only has to ensure that the matching fields are combined. If the fields differ in time (TV material), comb-like artifacts arise because the contents do not match: the lines of one field appear shifted relative to the lines of the other. The more movement there is in the scene, the greater the change between the fields and the stronger the comb effect. The process produces one full image from two fields, which yields a frame rate of only 25 Hz and would flicker visibly if displayed directly. Each full image is therefore displayed twice in order to reach a rate of 50 Hz again. Weave is thus a very simple process, but it has the decisive disadvantage of comb artifacts. It is unsuitable for TV recordings, for which other deinterlacing processes are required. However, many deinterlacers take input signals in which the fields have already been joined using Weave and process them further.

In short: merge fields into full image. Unsuitable for TV material!
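
A minimal Python sketch of Weave, assuming each field is a list of rows and both fields have the same number of lines (the function name `weave` and the sample values are illustrative only). The second example mimics TV material, where an object has moved one pixel between the two fields.

```python
def weave(top, bottom):
    """Interleave the lines of a top and a bottom field into one full image."""
    frame = []
    for top_row, bottom_row in zip(top, bottom):
        frame.append(list(top_row))     # odd picture line (1, 3, 5, ...)
        frame.append(list(bottom_row))  # even picture line (2, 4, 6, ...)
    return frame


# Film material: both fields come from the same moment in time.
film = weave([[1, 1, 1, 1], [3, 3, 3, 3]],
             [[2, 2, 2, 2], [4, 4, 4, 4]])

# TV material: a bright vertical bar (value 9) moved one pixel to the right
# between the top field and the bottom field.
combed = weave([[0, 9, 0, 0], [0, 9, 0, 0]],
               [[0, 0, 9, 0], [0, 0, 9, 0]])
```

In `combed`, the bar alternates between column 1 and column 2 from line to line, which is exactly the comb structure described above.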

Blur

With blur, the full image is created in a similar way to the weave technique. The two fields are again merged, but the resulting full image is softened before it is displayed. This is an attempt to weaken the comb effect, but it also leads to a visibly blurred output.

In short: merge fields and soften the result.
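
The text does not specify which softening filter is used. As a hedged illustration, the following sketch assumes a simple vertical 1-2-1 filter applied to an already woven image; the function name `vertical_blur` and the filter weights are assumptions made for this example, and the border lines simply reuse the nearest existing line.

```python
def vertical_blur(frame):
    """Soften a woven image with a vertical 1-2-1 filter (assumed weights)."""
    last = len(frame) - 1
    blurred = []
    for y, row in enumerate(frame):
        above = frame[max(y - 1, 0)]     # clamp at the top border
        below = frame[min(y + 1, last)]  # clamp at the bottom border
        blurred.append([(a + 2 * c + b) / 4
                        for a, c, b in zip(above, row, below)])
    return blurred


# A woven image with comb artifacts: the bar alternates between two columns.
combed = [[0, 8, 0, 0], [0, 0, 8, 0], [0, 8, 0, 0], [0, 0, 8, 0]]
softened = vertical_blur(combed)
```

The comb teeth are smeared into two weaker neighbouring columns: the line-to-line alternation is reduced, but so is the sharpness of every other vertical detail.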

Skip Field

It is now clear that the most important goal is to eliminate comb artifacts. One approach is therefore to create the full image from only one field; the other field is simply dropped. However, this sacrifices half of the vertical resolution of the original and leaves a picture of only half the height, which then has to be scaled back up to the original size.

Skip field with line doubling or interpolation

If the image is not to be enlarged afterwards, Skip Field can also produce the full image by simply doubling the lines. This gives a poor-quality result, which is why the missing lines are usually obtained by interpolation instead. The simplest approach is to compute a missing line from the two lines surrounding it. Including more lines in the interpolation improves the result, but also increases the computational effort. At the end, each resulting image is again displayed twice to prevent flickering. The big problem with the Skip Field technique is that movements appear noticeably choppy, since one field is simply left out and image information is ultimately missing. In addition, horizontal details that are so small that they appear in only one field at a time are lost. The method does have the advantage that no comb effects occur. Overall, the end result looks softer than the original, because the image has to be scaled up and because the reconstructed lines never exactly match the original material, even with good interpolation.

In short: discard the even or the odd lines, then reconstruct a full image from the remaining field.
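
The simplest interpolation variant, computing each missing line from the two lines surrounding it, might look as follows in Python. This is a sketch under assumptions: only the top field is kept, the function name `skip_field_interpolate` is invented for the example, and the last line, which has no neighbour below, is filled by line doubling.

```python
def skip_field_interpolate(top):
    """Build a full image from the top field alone.

    Each missing line is the average of the field lines above and below it.
    """
    frame = []
    for i, row in enumerate(top):
        frame.append(list(row))  # line taken directly from the field
        if i + 1 < len(top):
            below = top[i + 1]
            frame.append([(a + b) / 2 for a, b in zip(row, below)])
        else:
            frame.append(list(row))  # no neighbour below: repeat the line
    return frame


top = [[0, 0], [100, 50]]
frame = skip_field_interpolate(top)
```

The bottom field never enters the calculation, which is why detail contained only in that field is lost and motion looks choppier than in the source.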

Bobbing (line averaging)

With bobbing, each field is expanded to a full image. Unlike with Skip Field, no field is left out. The missing lines of both the odd and the even field are determined, which yields two full images; the first is shown first, then the second. The first and last lines are difficult to interpolate, however, since there is no neighboring line above or below from which information for the reconstruction could be drawn. If these lines are not calculated, the picture jumps up (first line missing) and down (last line missing) when switching between the full images. As with Skip Field, the result appears soft and horizontal details may be missing. Comb artifacts do not appear either. As an improvement, the method offers fluid movement, since no field is left out. In addition, the frame rate remains at 50 Hz: 50 fields become 50 full images per second. The eponymous disadvantage of bobbing is the vertical wobbling.

In short: interpolate the missing lines from field 1, do the same for field 2. Play the two frames obtained one after the other.
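
A rough Python sketch of bobbing, assuming both fields have the same number of lines and using plain averaging of the neighbouring lines. The names `average` and `bob` are illustrative. Filling the border lines by repeating the nearest field line is one possible choice made for this example, not something the text prescribes.

```python
def average(row_a, row_b):
    """Pixel-wise mean of two lines."""
    return [(a + b) / 2 for a, b in zip(row_a, row_b)]


def bob(top, bottom):
    """Expand each field to a full image; returns [from_top, from_bottom]."""
    n = len(top)
    from_top, from_bottom = [], []
    for i in range(n):
        # The top field supplies lines 1, 3, 5, ...; fill the line below each.
        from_top.append(list(top[i]))
        from_top.append(average(top[i], top[i + 1]) if i + 1 < n
                        else list(top[i]))
        # The bottom field supplies lines 2, 4, 6, ...; fill the line above each.
        from_bottom.append(average(bottom[i - 1], bottom[i]) if i > 0
                           else list(bottom[i]))
        from_bottom.append(list(bottom[i]))
    return [from_top, from_bottom]


frames = bob(top=[[0], [40]], bottom=[[20], [60]])
```

Two fields go in and two full images come out, so the rate of 50 images per second is preserved. The two images are reconstructed from lines at different vertical positions, which is the source of the wobble.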

Blending

Blending, or averaging, works in a similar way to bobbing. The full images are obtained by expanding fields, either by simple line doubling or by interpolation. The difference is that with blending the generated full images are not displayed individually one after the other. Once both full images have been created, they are superimposed and their mean value is calculated. This gives a better result than purely spatial interpolation, since the temporal dimension is also included (3D interpolation over the two spatial dimensions x and y and the time dimension z, where the time span covered is fixed at 2 × 1/50 s = 1/25 s). In a modified form, the method can also be applied only to certain areas (e.g. where comb artifacts are particularly pronounced). The final image is again shown twice to avoid flicker. The advantage of blending is that the wobble typical of bobbing does not occur. However, blending the two images blurs moving structures.

However, this blur corresponds to the motion blur of longer exposure times, i.e. of the progressive frame rates 24p and 25p with their correspondingly slower shutter speed, which cannot be achieved with 50i (see rotating aperture). For this reason, blending is often used as a quick, simple way to make hectic, unsightly video movement (the dreaded so-called shutter effect) appear at least calmer during playback and aesthetically closer to film. In a still image, however, the result is not completely identical to a genuinely longer exposure time: especially with fast movements it is a double image rather than true motion blur, which the human eye does not notice during playback at 25 fps.

This method is only suitable for deinterlacing original 50i video material. Scanned film material that was shot at 24 full frames per second but scanned amateurishly without weaving (see above), i.e. as 50 fields with field lines visible on the computer, also loses half of its effective frames per second when blended. Because of the recording method, such material never contained more information than 25 fps, so in the end only 12.5 effective images per second remain. This is not enough to maintain the illusion of movement, and the added motion blur is also very noticeable here. For scanned film material that shows field lines, purely spatial interpolation with Skip Field is therefore preferable.

In short: interpolate the missing lines from field 1, do the same for field 2. Place the two full images obtained on top of one another and determine the mean value. The result: 50 fields become 25 full images.
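
Blending can be sketched in Python as follows. For brevity, this example expands each field by line doubling, which the text names as one of the two options; the function names `line_double` and `blend` are made up for the illustration.

```python
def line_double(field):
    """Expand a field to full height by repeating every line."""
    frame = []
    for row in field:
        frame.append(list(row))
        frame.append(list(row))
    return frame


def blend(top, bottom):
    """Expand both fields to full images and average them pixel by pixel."""
    full_a = line_double(top)
    full_b = line_double(bottom)
    return [[(a + b) / 2 for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(full_a, full_b)]


# A bright pixel (value 100) moves one column to the right between the fields.
frame = blend(top=[[0, 100, 0, 0]], bottom=[[0, 0, 100, 0]])
```

The moving pixel appears in both positions at half brightness. This is the double image mentioned above, which approximates motion blur without being identical to it.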

Adaptive

Adaptive deinterlacing is the most highly developed and most complex method. It differs from the methods described above in that the preceding and following fields are also included when processing a given field. First, a detailed motion analysis is carried out. Parts of the field in which no or only negligible movement was detected can then be completed by simple weaving without fear of comb artifacts. This avoids the disadvantages of bobbing (wobbling) and blending (blurring). For moving parts of the picture, however, another method must be chosen. The deinterlacer tries to recognize moving picture elements and to reconstruct them from other fields with as little loss as possible. The more preceding or following fields are included in this process, the better the expected result. Of course, this also increases the computational effort. In addition, each following field that is taken into account delays the image output by 0.02 seconds (with PAL), because that field must first be waited for. If the sound is not delayed accordingly, image and sound run out of sync, although within the usual range this is not noticeable. The deinterlacer then only has to interpolate those moving picture elements that could not be reconstructed. Again, different methods can be used for this.

In summary, adaptive deinterlacing ideally delivers the best result: full images in very good image quality and at the full frame rate. However, it also has a number of disadvantages. As already mentioned, the process is very computationally intensive. Software deinterlacers therefore need a very fast system in order to work properly, and corresponding hardware deinterlacers are expensive. Whether the quality gain justifies the additional price remains questionable. In addition, it is very difficult to develop a reliable adaptive deinterlacer because the algorithms required are complex. Mediocre or even faulty adaptive deinterlacers often add so many disruptive artifacts to the image that the image quality is poor. Finally, the quality that an adaptive deinterlacer delivers depends crucially on the quality of the source material. A good result can only be achieved with high-quality image signals. Image disturbances such as noise or graininess can quickly throw even high-quality deinterlacers off course. The result is again strong artifacts that impair the image quality. In this case, simple bobbing or blending is often better.
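
Real adaptive deinterlacers are far more elaborate than anything that fits here. The following Python sketch only illustrates the basic decision described in this section, under strong simplifying assumptions: motion is detected per pixel by comparing the bottom fields before and after the current top field, static pixels are woven in, and moving pixels are interpolated from the current field. The function name `adaptive_deinterlace` and the `threshold` value are hypothetical.

```python
def adaptive_deinterlace(cur_top, prev_bottom, next_bottom, threshold=10):
    """Complete the current top field to a full image.

    Static pixels are taken from the neighbouring bottom field (weave);
    moving pixels are averaged from the lines above and below (interpolation).
    """
    n = len(cur_top)
    frame = []
    for i in range(n):
        frame.append(list(cur_top[i]))
        below = cur_top[i + 1] if i + 1 < n else cur_top[i]
        row = []
        for x in range(len(cur_top[i])):
            motion = abs(prev_bottom[i][x] - next_bottom[i][x])
            if motion <= threshold:
                row.append(prev_bottom[i][x])               # static: weave
            else:
                row.append((cur_top[i][x] + below[x]) / 2)  # moving: interpolate
        frame.append(row)
    return frame


cur_top = [[10, 10, 10], [30, 30, 30]]
prev_bottom = [[20, 20, 200], [40, 40, 40]]  # one pixel changes over time
next_bottom = [[20, 20, 20], [40, 40, 40]]
frame = adaptive_deinterlace(cur_top, prev_bottom, next_bottom)
```

Because the function needs `next_bottom`, the output for a field can only be produced once the following field has arrived, which corresponds to the 0.02 second delay per look-ahead field mentioned above.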

Motion compensation

The most modern deinterlacing methods use motion compensation. The fields are combined with Weave. Since movement would produce comb structures with Weave, the moving picture elements are identified first. An attempt is then made to match the parts of field 1 with their equivalents in field 2, and only then are the fields combined with Weave. The process is very complex, but has now become state of the art for televisions. It is usually combined with adaptive filters, so that a very good overall impression is created. Methods that work with motion compensation provide the best result of all variants. Only cheap LCD televisions based on PC monitor technology still use uncompensated deinterlacing.

In short: identify the moving parts of the image, align them between the two fields, and only then combine the fields with Weave.

Deinterlacing at frame rates other than 25p / 50i (including NTSC)

The types of deinterlacing described here apply primarily to native 50i PAL video material and to 24p film material that was accelerated to 25p or 25i via PAL speed-up during scanning (see film scanner). Things are a little more complicated with NTSC material and with analogue film material shot at film speeds other than 24 fps.

Native NTSC 29.97i to NTSC 29.97p

Converting native NTSC video material into full images by means of blending causes the fewest difficulties, since the same procedure as with PAL can be used without any problems.

Film on NTSC

There is no simple method for scanning 24 fps film material to NTSC comparable to the PAL speed-up; a corresponding acceleration to 29.97 fps would simply be too noticeable. Since 29.97 is not an integer multiple of 24, the material instead has to go through a complicated interlacing process, the so-called pull-down (see film scanner), for example by repeating individual images for different numbers of fields. If the material remains in NTSC format, it must remain interlaced, since weaving would otherwise also destroy the fluidity of the movements.
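
The text does not name a specific cadence. As an illustration, the following Python sketch assumes the widely used 2:3 pattern, in which film frames alternately contribute two and three fields, so that 4 film frames become 10 fields (24 × 10 / 4 = 60 fields per second nominally). The function name `pulldown_2_3` and the frame labels are hypothetical.

```python
def pulldown_2_3(film_frames):
    """Turn film frames into a field sequence using an assumed 2:3 cadence."""
    fields = []
    for i, frame in enumerate(film_frames):
        repeat = 2 if i % 2 == 0 else 3  # frames alternately give 2 and 3 fields
        for _ in range(repeat):
            parity = "top" if len(fields) % 2 == 0 else "bottom"
            fields.append((frame, parity))
    return fields


fields = pulldown_2_3(["A", "B", "C", "D"])

# Pair consecutive fields into interlaced video frames (source frame labels only).
video_frames = [(fields[i][0], fields[i + 1][0])
                for i in range(0, len(fields), 2)]
```

Two of the five resulting video frames mix fields from different film frames (B with C, and C with D). Weaving those frames would produce combing, which is why such material cannot simply be woven back together.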

However, such material can be converted into PAL 25p using inverse telecine (sometimes also referred to as pull-up). Inverse telecine (possible for home users with the freeware tool VirtualDub via its frame rate option) first yields a file with progressive material at a frame rate of 23.976 fps. This file is then simply subjected to the usual PAL speed-up by accelerating image and sound to 25 fps, and the resolution and pixel aspect ratio are adjusted by enlarging the image slightly to match the PAL format.
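
The principle of inverse telecine can be sketched in Python under idealized assumptions: the material follows a 2:3 cadence, it is noise-free, and consecutive film frames always differ, so that a repeated field is bit-identical to the field of the same parity two positions earlier. Real tools have to detect the cadence far more robustly. The names `inverse_telecine` and `speed_up` are made up for this example, and the snippet does not reflect how VirtualDub works internally.

```python
def inverse_telecine(fields):
    """Recover progressive frames from a list of (parity, rows) fields."""
    unique = []
    for i, field in enumerate(fields):
        if i >= 2 and field == fields[i - 2]:
            continue  # repeated field that the pull-down inserted
        unique.append(field)
    frames = []
    for first, second in zip(unique[0::2], unique[1::2]):
        top, bottom = (first, second) if first[0] == "top" else (second, first)
        frame = []
        for top_row, bottom_row in zip(top[1], bottom[1]):
            frame.append(top_row)
            frame.append(bottom_row)
        frames.append(frame)
    return frames


# Ten fields produced from four two-line film frames A..D (2:3 cadence).
fields = [("top", [["A0"]]), ("bottom", [["A1"]]),
          ("top", [["B0"]]), ("bottom", [["B1"]]), ("top", [["B0"]]),
          ("bottom", [["C1"]]), ("top", [["C0"]]),
          ("bottom", [["D1"]]), ("top", [["D0"]]), ("bottom", [["D1"]])]
frames = inverse_telecine(fields)

# PAL speed-up factor from 23.976 fps (exactly 24000/1001) to 25 fps.
speed_up = 25 / (24000 / 1001)
```

The four original frames are recovered, and the subsequent speed-up amounts to playing image and sound a little over 4 % faster.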

Native NTSC 29.97i to PAL 25p

The inverse telecine method should also be used here.

Other film speeds (pull-down for PAL)

Occasionally, especially with historical and amateur films, one comes across film speeds other than 24 fps. Silent films from before the invention of sound film were often recorded at a standardized but slower speed (with spring mechanisms often at 12–16 fps); Normal8 and Super8 ran, or still run, in the sound version mostly at 16 fps (Normal8) and 18 fps (Super8).

As with the pull-down for NTSC, such frame rates do not divide evenly into 24, 25, 29.97 or 50. To this day, on both NTSC and PAL such films are often scanned at the usual 24 to 25 fps, which is why the movements in such material often appear choppy and too fast. However, this material can also be subjected to a pull-down that introduces interlacing, so that the original speed is retained on both PAL and NTSC. This is already standard in the professional scanning of Normal8 and Super8 for private customers, but it is often not yet done with historical film material on 16 mm and 35 mm, as can still be seen very often in historical documentaries.

The objection that effective frame rates below 24 fps cannot be expected of the viewer does not hold. First, anyone who has seen a flicker-free projection of Normal8 or Super8 at 16–18 fps can confirm this (flicker in the form of light-dark fluctuations arises at most when the screen is simply filmed with a video camera). Second, even rates below 16 fps pose no problem for the modern viewer, since the dark phase between the television images does not lengthen; this is easy to see in the majority of modern cartoons and animated films, which get by with significantly fewer than 20 fps. The BBC therefore used such a pull-down process for its mammoth documentary People's Century, published on the occasion of the turn of the millennium, in which a lot of historical film material was shown at its original, natural speed for the first time.

Here, however, the same dilemma arises for PAL as arises in general for film material on NTSC: simple deinterlacing, whether by weaving or by blending, either loses individual images and makes the material very jerky (weaving) or smears movements more than the longer shutter speed of film would (blending). Inverse telecine can be applied to recover the original scanning speed as the frame rate (every film image becomes one progressive video image), but material converted back in this way cannot be edited together with material that runs at 24p, 25p, 25i or NTSC speeds.

This is the only case in which deinterlacing should not be performed.
