Image growing

Image Growing ( English called "image cultivation"), commonly also called "texture synthesis by Efros and Leung" is a nonparametric texture synthesis algorithm . The technique was introduced in 1999 by Alexei A. Efros and Thomas K. Leung.

The aim of non-parametric texture synthesis is to automatically generate new images from a template image that look as similar as possible to the template but are not identical to it. Image Growing lets a new, similar digital image "grow" pixel by pixel from a raster graphic .

functionality

There are two variants of this technique, which differ only insignificantly. With the actual image growing, a small piece of the template is housed as a seed in a largely empty image . The algorithm then lets the texture “grow” pixel by pixel around the seed. If all empty image areas are completely filled, the synthesis is finished; the texture can be "harvested". In the Hole Filling variant , the starting situation is exactly the other way around: A largely filled image contains a few holes , empty spaces that are filled by the algorithm. The functionality is the same in both cases.

In order to set a pixel, Image Growing searches the template for pixels that have a similar environment and combines them into a candidate list . The new pixel adopts the color value of a pixel selected at random from this candidate list.

input

In its basic form, the algorithm receives three inputs:

Template: Any digital image can be used as a template . Size, content, graphic format , color depth and color space are irrelevant from the point of view of the algorithm and only depend on the respective implementation and the hardware used.

Prepared Edition: This is a digital image with the desired dimensions of the output. It contains both pixels with image content and pixels with a special color value empty . The algorithm works on this image and fills in all pixels marked as empty . When the algorithm is finished, this image contains the synthesized texture. The color value empty must not appear in the template, if possible it should not be a valid color value at all.

Environment: This filter mask indicates which pixels belong to the environment of a pixel and how much they are taken into account in the environment comparison. Clearly, it is a matrix with an odd number of rows and columns, which contains numbers between 0 and 1, which add up to 1; the larger a number, the more important the associated pixel is when comparing surroundings. The middle entry is ignored and can therefore contain any value. The article image processing provides details on the complex topic of filter masks .

${\ displaystyle {\ tfrac {1} {16}} \ cdot {\ begin {pmatrix} 1 & 2 & 1 \\ 2 & 4 & 2 \\ 1 & 2 & 1 \ end {pmatrix}}}$
Gaussian-like 3x3 environment.

In addition to these main arguments, the algorithm requires a further parameter that can either be queried as user input or permanently specified in the algorithm:

Threshold value: The threshold value indicates the level of similarity from which a pixel counts as a candidate. The value is greater than or equal to 0 and has a potentially unlimited upper limit. The lower the value, the more similar the environments must be.

algorithm

Determine the amount of empty pixels that will be set in this pass. These are all pixels of the output image that are directly adjacent to a non-empty pixel vertically, horizontally or diagonally. If there are no more empty pixels, terminate the algorithm.
Choose any pixel from the set of empty pixels. The candidate list is now created for this pixel in an inner loop:
1. Select a pixel of the template image that has not yet been viewed in this inner loop.
2. Lay the original image, output image and filter mask on top of each other so that the pixels you are currently viewing and the center of the filter mask are directly above each other; in the following only the area under the filter mask is considered. Subtract the superimposed color values of the two images from each other, take the amount and multiply it by the number on the filter mask above. Add up all the values found. Important: Pixels that are empty and areas where the images do not overlap are completely ignored.
3. If the value just determined is less than the threshold value, add the template pixel under consideration to the candidate list. If not all pixels of the template have been considered, continue with 2.1.
Randomly select a pixel from the candidate list and assign its color value to the currently viewed pixel of the output image.
Remove the just placed pixel from the set of empty pixels. If the set is empty, go back to 1, otherwise to 2.

output

The output of the algorithm consists of the synthesized image that is in the modified prepared output .

theory

A texture in texture synthesis is always a stationary image, ie all parts of the image look similar; If you look at different areas of such a texture through a small viewing window, you get the impression of always seeing the same thing. Texture synthesis according to Efros and Leung makes use of this special property by modeling texture as a Markov network . A Markov network consists of interconnected objects, so-called nodes , which can assume different states. In such a Markov network, the so-called Markov property applies : The state of each node depends on the states of the pixels surrounding it in a locally limited environment.

In a texture, each pixel represents a node in a Markov network, with each pixel connected to its eight neighboring pixels. The locally limited environment is given by the environment mask; this indicates not only how big the environment is, but also how a pixel is influenced by its neighbors. The Markov property is closely related to stationary; if one tries to model a non-stationary image in a similar way, one finds that the state of a pixel depends on all other pixels.

evaluation

Image Growing produces high quality images if the parameters are chosen correctly. As a rule, the larger the filter mask, the better the results. The size should be chosen so that the filter mask covers the largest structure in the template.

AA Efros and TK Leung themselves noticed an important weak point: Sometimes the algorithm stiffens itself in its search for candidates on a small part of the template and produces unsatisfactory results from there. Larger structures then gradually disintegrate into image noise or parts of the original are copied identically.

running time

According to the classic model of the register machine , the following parameters influence the runtime of the algorithm:

Number n of pixels in the original image.
Number m of pixels in the output image.
Number b of table cells of the filter mask.

The running time is estimated asymptotically upwards by O (m · n · b). The repeated search for pixels that are still empty is generously neglected, as it can turn out very differently depending on the process and progress. Since m >> n and b> 1 usually apply, this running time is significantly worse than O (n ² ). The algorithm is very slow even on fast computers.

Memory requirements

The memory requirement of the algorithm is comparatively small. It varies over time with the size of the number of pixels to be set and the size of the candidate list, but the image sizes can be used as upper bounds for these.

Approaches to improvement

Empty pixels

In the original approach, the image is searched in full on each pass in order to pick out those empty pixels that will be set in the next pass. This exhaustive search is far from optimal and can be sped up.

A simple approach is to dictate the size, shape and positioning of the seeds. This makes it possible to predict the next empty pixels in each step without even having to look at them. This approach is particularly impractical for hole filling, where the position, size, and shape of the empty areas cannot be predetermined. A slight increase in quality can also be observed when the seed is positioned in the center of the image.

More complicated approaches use additional data structures that are generated in advance and allow for faster searches for empty pixels. For example, a graph would be conceivable in which each pixel refers to those pixels that can be set once the pixel itself has been set.

Candidate list

The creation of the candidate list can also be accelerated by including additional search structures. For example, texture synthesis with binary tree-supported vector quantization suggests a special binary tree in which the pixels and their surroundings are cleverly arranged.

swell

^ AA Efros, TK Leung: Texture Synthesis by Non-parametric Sampling. In: Proceedings of IEEE International Conference on Computer Vision, Corfu, Greece, September 1999. PDF 1 MB

[1] AA Efros, TK Leung: Texture Synthesis by Non-parametric Sampling. In: Proceedings of IEEE International Conference on Computer Vision, Corfu, Greece, September 1999. PDF 1 MB