Segmentation (image processing)

Segmentation is a branch of digital image processing and computer vision. It refers to the generation of content-related regions by grouping neighboring pixels or voxels according to a homogeneity criterion.

Classification

In the machine vision process, segmentation is usually the first step of image analysis and follows image preprocessing. The overall processing chain is

Scene → image acquisition → image preprocessing → segmentation → feature extraction → classification → statement

Properties

One speaks of complete segmentation when each pixel is assigned to at least one segment. In an overlap-free segmentation, each pixel is assigned to at most one segment. In a complete and overlap-free segmentation, each pixel is therefore assigned to exactly one segment. A segmentation is called contiguous if each segment forms a connected region.
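These properties can also be stated formally. A minimal sketch, in which the notation $I$ for the set of all pixels and $S_1, \dots, S_n$ for the segments is introduced here only for illustration:

    % Complete: every pixel lies in at least one segment
    \bigcup_{i=1}^{n} S_i = I
    % Overlap-free: no pixel lies in two different segments
    S_i \cap S_j = \emptyset \quad \text{for all } i \neq j
    % Complete and overlap-free: the segments S_1, \dots, S_n form a partition of I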

Methods

Many methods for automatic segmentation are known. They are basically divided into pixel-oriented, edge-oriented and region-oriented methods. In addition, a distinction is made between model-based methods, in which a certain shape of the objects is assumed, and texture-based methods, in which an internal homogeneous structure of the objects is also taken into account.

The boundaries between these categories are often fluid, and different methods can be combined to achieve better results.

Segmentation can also be carried out non-automatically, i.e. a person performs the division. Since automatic methods are far from perfect, semi-automatic processing is also an option.

Pixel-oriented methods

Pixel-oriented methods decide for each individual pixel whether it belongs to a certain segment or not. This decision can, but does not have to, be influenced by the pixel's neighborhood. Pixel-oriented methods are usually easy to compute and therefore fast, but do not by themselves produce connected segments. One speaks of segmentation when individual objects can be counted in a binarized image; each segmented object is then described, for example, by a run-length encoding of the binarized pixels. Binarization is thus the preliminary stage of a segmentation.

The most common binarization method is certainly the threshold value method. This method is based on a threshold value that is best determined using a histogram.

Example image
Image after binarization with threshold value 90

In the illustration, the background is lighter than the black object. In the simplest case, the threshold value for binarization is the mean of the darkest and lightest gray value in the image. Segmentation is often the preliminary stage of a classification.
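A minimal sketch of this simplest form of thresholding, assuming a single-channel grayscale image stored as a NumPy array (the function name binarize is chosen here only for illustration):

    import numpy as np

    def binarize(image):
        """Binarize a 2-D grayscale image with the midpoint threshold."""
        # Threshold = mean of the darkest and lightest gray value in the image
        threshold = (int(image.min()) + int(image.max())) / 2
        # True where the pixel is lighter than the threshold (the background in
        # the example above), False where it is darker (the object)
        return image > threshold

Counting the connected regions of the resulting binary mask, for example with scipy.ndimage.label, then yields the individual segments.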

Edge-oriented methods

In these methods, edges or object transitions are searched for in the image. Many algorithms do not yet deliver closed contours; these must first be combined in further processing steps so that they enclose objects. Strictly speaking, edges always lie between the pixel regions of an image. The result of an algorithm can be polygons (or lines or, in special cases, curves), but some operators also return the edges as pixels of a different color. In the OpenCV library, for example, each segmented object is described by an enclosing polygon (contour). Segmentation can also be used to divide an image into a foreground plane and a background plane.
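A minimal sketch of this contour representation with OpenCV's Python bindings; the file name example.png is hypothetical, the threshold 90 is an illustrative value, and the OpenCV 4.x return signature of findContours is assumed:

    import cv2

    # Load a grayscale image in which the objects are darker than the background
    gray = cv2.imread("example.png", cv2.IMREAD_GRAYSCALE)

    # Binarize (inverted, so that the objects become white foreground)
    _, binary = cv2.threshold(gray, 90, 255, cv2.THRESH_BINARY_INV)

    # Each detected object is returned as an enclosing polygon (contour)
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

    for contour in contours:
        x, y, w, h = cv2.boundingRect(contour)  # bounding box of the segmented object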

With operators such as the Sobel operator and the Laplace operator, as well as a gradient search, pixels belonging to an edge can be found. As a rule, however, these are initially still unconnected and have to be completed with edge-tracking algorithms. A popular method for generating a coherent object silhouette, or at least connected edge chains, from the edge pixels is the live wire method by E. Mortensen, W. A. Barrett and J. K. Udupa. The idea can be compared to a navigation system that determines an optimal route from a start point to a destination. In the context of segmentation, optimal means that the path between start and destination always runs over the strongest edge pixels. Finding the optimal route is a standard problem in computer science and can be solved, for example, with a breadth-first search.
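A minimal sketch of the first step, finding loose edge pixels with the Sobel operator, using scikit-image and one of its sample images; the threshold 0.1 is an illustrative value:

    from skimage import data, filters

    image = data.camera()               # 8-bit grayscale sample image
    gradient = filters.sobel(image)     # Sobel gradient magnitude per pixel

    # Loose edge pixels that still need to be linked by edge tracking
    edge_pixels = gradient > 0.1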

Another well-known method is the watershed transformation, which works on gray-scale images and always provides closed edges. Further methods are parallel and sequential edge extraction, optimal edge search, the Felzenszwalb-Huttenlocher algorithm, active shape models and snakes.
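A minimal sketch of a marker-based watershed segmentation with recent versions of scikit-image, following the library's well-known coins example; the marker thresholds 30 and 150 are illustrative values for an 8-bit image:

    import numpy as np
    from skimage import data
    from skimage.filters import sobel
    from skimage.segmentation import watershed

    image = data.coins()                  # 8-bit grayscale sample image
    elevation = sobel(image)              # gradient magnitude acts as the "landscape"

    # Markers for certain background (1) and certain objects (2)
    markers = np.zeros_like(image, dtype=np.int32)
    markers[image < 30] = 1
    markers[image > 150] = 2

    # Flooding the landscape from the markers yields closed region boundaries
    labels = watershed(elevation, markers)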

Region-oriented methods

Region-oriented methods consider point sets as a whole and try to find connected objects. Frequently used methods are region growing, region splitting, pyramid linking, and split and merge.
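A minimal sketch of region growing on a grayscale image, written here in plain NumPy for illustration: starting from a seed pixel, neighbors are added as long as their intensity stays close to the running mean of the region (the tolerance parameter and 4-neighborhood are assumptions of this sketch):

    from collections import deque
    import numpy as np

    def region_growing(image, seed, tolerance=10):
        """Return a boolean mask of the region grown from `seed` (row, col)."""
        region = np.zeros(image.shape, dtype=bool)
        region[seed] = True
        mean = float(image[seed])       # running mean intensity of the region
        count = 1
        queue = deque([seed])
        while queue:
            r, c = queue.popleft()
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):   # 4-neighborhood
                nr, nc = r + dr, c + dc
                if (0 <= nr < image.shape[0] and 0 <= nc < image.shape[1]
                        and not region[nr, nc]
                        and abs(float(image[nr, nc]) - mean) <= tolerance):
                    region[nr, nc] = True
                    mean = (mean * count + float(image[nr, nc])) / (count + 1)
                    count += 1
                    queue.append((nr, nc))
        return region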

In a mathematically more demanding view, the image is understood not as a matrix of pixels but as a continuous function that, for example, maps the unit square into the color space (e.g. $f\colon [0,1]^2 \to [0,1]$ for a grayscale image).

Energy-based methods assign a real energy value to every possible segmentation of the image and search for a minimum of this energy functional. In this context, a segmentation is understood as an image $u$ with regions of uniform (often constant) intensity; a set of edges $K$ separates the regions. Different energies can be used depending on the application. Typically, one or more of the following terms are used; they are combined into a single functional in the sketch after the list:

  • The difference between the segmentation $u$ and the original image $f$, for example $\int_{\Omega} (u - f)^2 \, dx$.
  • A measure of the length of the edges between individual segmentation regions, for example the one-dimensional Hausdorff measure $\mathcal{H}^1(K)$ as the length of the segmentation edge $K$.
  • If the segmentation regions are not required to have constant intensity: a measure of intensity variation such as $\int_{\Omega \setminus K} |\nabla u|^2 \, dx$.
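Taken together, these terms lead to energy functionals of the Mumford-Shah type. A sketch, in which the weighting parameters $\mu, \lambda > 0$ and the exact combination of terms vary by application:

    % data fidelity + smoothness inside the regions + penalty on the edge length
    E(u, K) = \int_{\Omega} (u - f)^2 \, dx
              + \mu \int_{\Omega \setminus K} |\nabla u|^2 \, dx
              + \lambda \, \mathcal{H}^1(K)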

Possible solution methods are then:

  • Graph-cut methods, which are based on the continuous model but still lead to a discrete algorithm,
  • Variational methods, which decrease the energy functional by solving a partial differential equation.

The former are currently feasible in real time (30 fps) for smaller images, but offer at most pixel accuracy. The variational approach, on the other hand, also allows subpixel accuracy, which is particularly helpful with diagonal edges, where discrete methods always create staircase artifacts. Methods are currently being researched to solve the variational approaches on the processors of graphics cards (graphics processing units, GPUs). Speed advantages by a factor of 5 to 40 are predicted, which would make the variational approaches considerably faster.

Continuous methods have only been explored with visible success since around 2002 and are therefore not yet found in end-user software.

Model-based methods

Model-based methods start from a model of the objects sought, for example their shape; prior knowledge about the image content is thus exploited. A well-known method is the Hough transform, with which points can be joined into lines or circles by mapping them into a parameter space. Statistical models and segmentation using templates (template matching) are also used; in the latter, the image is searched for given templates.
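A minimal sketch of a Hough transform for straight lines with scikit-image; the Canny parameter and the sample image are chosen here only for illustration:

    from skimage import data, feature
    from skimage.transform import hough_line, hough_line_peaks

    image = data.camera()                      # grayscale sample image
    edges = feature.canny(image, sigma=2)      # binary edge map as input to the transform

    hspace, angles, dists = hough_line(edges)  # accumulator over the (angle, distance) space
    for _, angle, dist in zip(*hough_line_peaks(hspace, angles, dists)):
        # Each (angle, dist) peak describes one detected line:
        # x*cos(angle) + y*sin(angle) = dist
        pass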

Texture-oriented methods

Some image objects do not have a uniform color but a uniform texture. For example, an object may have grooves that appear in a photograph as alternating dark and light stripes. To prevent such objects from being broken up into many small segments on the basis of their texture, dedicated approaches are used. Some of these methods border on classification or allow simultaneous segmentation and classification. Common approaches include:

  • Co-occurrence matrices (Haralick matrices), as sketched after this list
  • Texture energy measures
  • Run-length matrices
  • Fractal dimensions and measures
  • Markov random fields and Gibbs potentials
  • Structural approaches
  • Signal-theoretic concepts
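A minimal sketch of the first approach, gray-level co-occurrence matrices, with scikit-image; the random patch merely stands in for a real image region, and the function names graycomatrix and graycoprops refer to scikit-image 0.19 or newer:

    import numpy as np
    from skimage.feature import graycomatrix, graycoprops

    rng = np.random.default_rng(0)
    patch = (rng.random((64, 64)) * 255).astype(np.uint8)  # stand-in for an image patch

    # Co-occurrence of gray levels at distance 1 in four directions
    glcm = graycomatrix(patch, distances=[1],
                        angles=[0, np.pi / 4, np.pi / 2, 3 * np.pi / 4],
                        levels=256, symmetric=True, normed=True)

    contrast = graycoprops(glcm, 'contrast')        # Haralick contrast per distance/angle
    homogeneity = graycoprops(glcm, 'homogeneity')  # higher for more uniform textures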

Problems

The quality of a segmentation is often not optimal. In such cases, one can choose a better method or improve the results by adding a pre-processing or post-processing step. Both can be done automatically (if the problems in the pipeline have already been identified) or manually.

A problem with many segmentation algorithms is their susceptibility to changing illumination within the image. This can mean that only one part of the image is segmented correctly while the segmentation of the other parts is unusable. Differences in brightness can be compensated by pre-processing, for example with a shading correction.
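A minimal sketch of one possible shading correction, assuming a single-channel image and the common approach of estimating the slowly varying illumination with a strong Gaussian blur and dividing it out (the sigma value is an assumption of this sketch):

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def shading_correction(image, sigma=50.0):
        """Flat-field style correction of a single-channel image."""
        image = image.astype(float)
        # Estimate the low-frequency illumination pattern
        illumination = gaussian_filter(image, sigma=sigma)
        illumination = np.maximum(illumination, 1e-6)   # avoid division by zero
        corrected = image / illumination                # remove the brightness trend
        return corrected / corrected.max()              # rescale to [0, 1]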

Common problems include oversegmentation (too many segments) and undersegmentation (too few segments). This can be countered by enriching the method with knowledge about the data to be processed; in the simplest case, the expected number of segments can be specified. In addition, a subsequent classification step can be inserted to merge segments that are classified identically. The segments can, of course, also be merged by hand.

Many of the algorithms presented (threshold value method, watershed transformation) only work on single-channel grayscale images. When processing multi-channel images (e.g. color images), information therefore remains unused. Further processing steps are required in order to combine several single-channel segmentations.

Applications

Model of a segmented and classified left human femur. It shows the outer surface (red), the interface between compact and cancellous bone (green), and the interface between cancellous bone and bone marrow (blue).

Segmentation is often the first step in image analysis for subsequent processing of the data, for example classification.

The applications of such methods are diverse. Automatic segmentation is currently used most frequently in medicine, for example in computed tomography or magnetic resonance imaging. It is also used in the processing of geodata, where satellite or aerial images (see remote sensing) are segmented into geometric data, and in the automatic optical quality control of workpieces (for example: is the drill hole in the right place?). In character recognition (OCR), segmentation separates the text from the background by binarizing the scanned image. Another application is face recognition.

Software

Image processing programs

Image processing libraries such as the free scikit-image offer segmentation algorithms as well as 'higher-level' image processing routines that build on them. Such libraries can be used, for example, to determine the positions of objects in a robotics application (see image processing).
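A minimal usage sketch of scikit-image's segmentation module; the SLIC superpixel algorithm and its parameters are chosen here only to illustrate the interface:

    from skimage import data, segmentation

    image = data.astronaut()   # RGB sample image shipped with scikit-image

    # Partition the image into roughly 100 superpixels
    labels = segmentation.slic(image, n_segments=100, compactness=10, start_label=1)

    # Overlay the superpixel boundaries for display
    outlined = segmentation.mark_boundaries(image, labels)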

Image editing programs

Many image editing programs, such as the free GIMP and the free IrfanView, offer simple segmentation algorithms such as threshold methods or edge detection with the Sobel or Laplace operator.

Character recognition and special software

As a first step, handwriting recognition programs can use segmentation to separate the text from the background.

Medicine and geoinformatics are frequent fields of application for specialized image segmentation software.

Literature

  • Rafael C. Gonzalez, Richard E. Woods: Digital Image Processing. Addison-Wesley, Reading 1992, ISBN 0-201-50803-6 (English).
  • Rainer Steinbrecher: Image Processing in Practice. Oldenbourg, Munich and Vienna 1993, ISBN 3-486-22372-0.
  • Thomas Bräunl, Stefan Feyrer, Wolfgang Rapf, Michael Reinhardt: Parallel Image Processing. Addison-Wesley, Bonn 1995, ISBN 3-89319-951-9.
  • Thomas Lehmann, Walter Oberschelp, Erich Pelikan, Rudolf Repges: Image Processing for Medicine. Springer, Berlin and Heidelberg 1997, ISBN 3-540-61458-3.
  • Bernd Jähne: Digital Image Processing. 5th edition. Springer, Berlin 2002, ISBN 3-540-41260-3.

References

  1. scikit-image.org: Module: segmentation - skimage docs. Retrieved September 8, 2018.