Pattern analysis

Pattern analysis is a branch of pattern recognition. It denotes the automatic generation of a description from a signal, the pattern. Examples of patterns are images, image sequences, and speech signals. Research in pattern analysis studies algorithms and system approaches for this problem.

Pattern classification assigns a pattern as a whole to one of finitely many classes. Pattern analysis, by contrast, decomposes a pattern into sub-patterns and assigns a symbolic description to these sub-patterns and to the relationships between them. This corresponds to a mapping from the set of sub-patterns into the infinite set of all possible symbolic descriptions.

Typical pattern analysis systems

In contrast to the often homogeneous structure of pattern classification systems such as speech recognizers or image recognizers, pattern analysis systems are structured heterogeneously. Nevertheless, there are some basic components; most systems differ only in how these components interact.

Methods

The method component bundles methods that are specifically tailored to processing particular signals such as speech or images, e.g. Kalman filters or snakes (active contours) for images.

Qualitative knowledge representation

To represent knowledge about the application domain in an automatic pattern analysis system efficiently and at the same time adequately, techniques from artificial intelligence are often used, e.g. semantic networks, frames, or first-order predicate logic (PL1). This knowledge is often ambiguous, which is why the methods are prone to errors.

Explanatory component

Example from medicine: if a pattern analysis system turns medical input data, for example X-ray images, into a symbolic output of the form “Patient X urgently needs surgery Y”, the doctor (and the patient) will ask why this operation is necessary and how the system arrived at this answer. Intermediate steps are therefore required. The explanatory component provides these intermediate steps together with the reasons why each step was taken.

Learning

Most knowledge bases are created by human experts in laborious and expensive manual work and are therefore prone to errors. Different experts produce different knowledge bases. Learning the knowledge automatically is therefore desirable, but often not feasible in practice.

Control component

The control component supplies the control strategy with which the knowledge represented in the knowledge base is processed. The specialized methods from the method component are used for this processing. The strategy often takes the form of a search in graphs, trees, or other kinds of search space, for example with the A* algorithm.
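The search-based control strategy can be illustrated with a minimal A* sketch. The graph, the node names and the heuristic values below are hypothetical and only stand in for a search space of partial interpretations; the sketch assumes non-negative edge costs and a heuristic that never overestimates the remaining cost.

```python
import heapq


def a_star(graph, heuristic, start, goal):
    """Return (cost, path) of a cheapest path from start to goal, or None."""
    # Queue entries are (f, g, node, path) with f = g + heuristic estimate.
    frontier = [(heuristic[start], 0, start, [start])]
    best_cost = {start: 0}
    while frontier:
        _, cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return cost, path
        if cost > best_cost.get(node, float("inf")):
            continue  # stale entry: a cheaper route to this node is known
        for neighbor, step in graph.get(node, []):
            new_cost = cost + step
            if new_cost < best_cost.get(neighbor, float("inf")):
                best_cost[neighbor] = new_cost
                estimate = new_cost + heuristic[neighbor]
                heapq.heappush(
                    frontier, (estimate, new_cost, neighbor, path + [neighbor])
                )
    return None


# Hypothetical search space: node -> list of (successor, step cost).
GRAPH = {
    "start": [("a", 1), ("b", 4)],
    "a": [("b", 2), ("goal", 6)],
    "b": [("goal", 2)],
}
# Hypothetical estimates of the remaining cost to the goal.
HEURISTIC = {"start": 4, "a": 3, "b": 2, "goal": 0}

print(a_star(GRAPH, HEURISTIC, "start", "goal"))
```

In a real pattern analysis system the nodes would be competing hypotheses and the costs would come from their scores; the sketch only shows the search order.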

Examples of pattern analysis systems

A complete image analysis system

Here is an exemplary, complete structure of an image processing and analysis system. This is roughly divided into three parts: image processing, image classification and image analysis.

  1. Image processing
    1. Source: camera (digital camera, camcorder), scanner etc.
    2. Digitized image: sampled, quantized image (e.g. a 1024 × 768 gray-value image with 8-bit quantization, i.e. gray value 0 = black and 255 = white).
    3. Preprocessing: normalization of the image, application of filters for noise reduction or similar (image restoration).
    4. Segmentation: subdivision of the image into homogeneous areas (same color, same texture, etc.).
    5. Feature extraction: combining important features of an image into feature vectors.
  2. Pattern classification
  3. Image analysis: building on the pattern classification, either image recognition (only what can be seen is relevant; the relationships between the objects in the image are irrelevant) or image interpretation (not just "car" and "human" in the image, but the interpretation that the car is running over the person) takes place.
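The image processing steps listed above can be sketched in a few lines. This is only an illustration under simplifying assumptions: the function names are hypothetical, the noise filter is a plain 3 × 3 mean filter, segmentation is a fixed gray-value threshold, and the feature vector contains just the area and mean gray value of the bright region.

```python
def quantize(image, levels=256):
    """Map intensities in [0.0, 1.0] to integer gray values 0..levels-1."""
    return [[min(levels - 1, int(v * levels)) for v in row] for row in image]


def mean_filter(image):
    """3x3 mean filter for noise reduction; borders use the available pixels."""
    height, width = len(image), len(image[0])
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            values = [
                image[j][i]
                for j in range(max(0, y - 1), min(height, y + 2))
                for i in range(max(0, x - 1), min(width, x + 2))
            ]
            row.append(sum(values) // len(values))
        result.append(row)
    return result


def segment(image, threshold=128):
    """Threshold segmentation: 1 marks the bright region, 0 the dark one."""
    return [[1 if v >= threshold else 0 for v in row] for row in image]


def extract_features(image, mask):
    """Feature vector (area, mean gray value) of the region marked in mask."""
    pixels = [
        image[y][x]
        for y in range(len(image))
        for x in range(len(image[0]))
        if mask[y][x]
    ]
    area = len(pixels)
    mean = sum(pixels) / area if area else 0.0
    return (area, mean)


# Tiny hypothetical input: one image row with intensities in [0, 1].
gray = quantize([[0.0, 0.5, 1.0]])
mask = segment(gray)
print(gray, mask, extract_features(gray, mask))
```

The resulting feature vector would then be passed on to the pattern classification stage.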

The Optoluchs image processing system from 1988 was one of the first systems in the field of machine vision.

Applications of image analysis

A complete speech analysis system

Here is an exemplary, complete structure of a speech processing and analysis system. It is roughly divided into two parts: speech recognition and speech processing / speech analysis / speech understanding.

  1. Speech recognition:
    1. Sampling of the analog speech signal, usually at 8 or 16 kHz, with 12 to 16 bit quantization per sample.
    2. Pre-processing: noise filter, removal of sections of pure silence or background noise, etc.
    3. Feature calculation: windowing (via a window function): for example, a 16 ms long window is taken every 10 ms (the overlap is intended). Features are then calculated for each window, e.g. by cepstral analysis or by linear prediction (LPC, linear predictive coding, see Linear Prediction), and combined into feature vectors. During feature calculation, the signal is often warped in a perceptually motivated way (see psychoacoustics, MFCC, Mel scale, Bark scale and ear).
    4. Classification and search: assignment of feature vector sequences to polyphones or words using Hidden Markov Models (HMM). A word graph or a list of the n-best word strings is created.
    5. Speech recognition: the actual speech recognition, i.e. the textual representation as a reconstruction of what was actually said, results from combining the acoustic model (HMM) with a language model (often N-grams).
  2. Speech processing / speech analysis:
    1. Prosody recognition: provides information about prosodic features of the speech, such as intonation, accent or rhythm. This information helps subsequent analysis steps to resolve ambiguities.
    2. Syntactic analysis: delivers the parsed utterance (e.g. using an LR parser).
    3. Semantic analysis: based on the syntactic structure produced by parsing, e.g. in the form of a syntax tree, the meaning is analyzed.
    4. Pragmatics: sometimes the meaning of a sentence can only be understood by taking the context into account.
    5. Dialog system: the interpreted utterance can now be fed to a dialog system (e.g. a robot), which is then able to generate a suitable response using speech synthesis.
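The windowing step of the feature calculation can be made concrete with a short sketch. It assumes a 16 kHz sampling rate, so a 16 ms window is 256 samples and a 10 ms shift is 160 samples. The Hamming window and the log-energy feature are assumptions chosen for brevity; they merely stand in for the window function and for the cepstral or LPC features named above, and the function names are hypothetical.

```python
import math


def frame_signal(samples, sample_rate=16000, window_ms=16, shift_ms=10):
    """Cut a signal into overlapping, Hamming-weighted frames."""
    window_len = sample_rate * window_ms // 1000  # 256 samples at 16 kHz
    shift = sample_rate * shift_ms // 1000        # 160 samples at 16 kHz
    hamming = [
        0.54 - 0.46 * math.cos(2 * math.pi * n / (window_len - 1))
        for n in range(window_len)
    ]
    frames = []
    # Only full windows are kept; a trailing partial window is dropped.
    for start in range(0, len(samples) - window_len + 1, shift):
        frames.append(
            [samples[start + n] * hamming[n] for n in range(window_len)]
        )
    return frames


def log_energy(frame):
    """One simple feature per frame; the small constant avoids log(0)."""
    return math.log(sum(v * v for v in frame) + 1e-10)


# One second of a hypothetical constant signal at 16 kHz.
signal = [1.0] * 16000
frames = frame_signal(signal)
features = [log_energy(f) for f in frames]
print(len(frames), len(frames[0]))
```

Because the shift is shorter than the window, consecutive frames share 96 samples, which is the overlap mentioned in the list.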

Possibilities to represent knowledge

The explicit representation of knowledge is a necessary requirement for pattern analysis systems. In contrast to artificial intelligence, however, pattern analysis must cope with uncertain input data and competing hypotheses, so the control of system activities is of great importance. In addition to AI methods, database systems for organizing knowledge and storing intermediate results are also considered. Various calculi such as fuzzy logic or Bayesian networks are used to evaluate hypotheses.
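How competing hypotheses can be scored under uncertain input is sketched below with a single application of Bayes' rule. This is a deliberate simplification: a full Bayesian network models dependencies between many variables, whereas the sketch only ranks two hypothetical object hypotheses, and all probabilities are invented for illustration.

```python
def posterior(priors, likelihoods):
    """Bayes' rule: P(h | data) is proportional to P(h) * P(data | h)."""
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(joint.values())
    if total == 0:
        raise ValueError("the observation is impossible under every hypothesis")
    return {h: p / total for h, p in joint.items()}


# Hypothetical numbers: two competing interpretations of one image region.
PRIORS = {"car": 0.5, "truck": 0.5}
LIKELIHOODS = {"car": 0.8, "truck": 0.2}  # P(observed features | hypothesis)

scores = posterior(PRIORS, LIKELIHOODS)
best = max(scores, key=scores.get)
print(best, scores)
```

A control component could use such scores to decide which hypothesis to expand next.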

Qualitative relational representation options

General formalisms of representation

In general, semantic networks are often used because knowledge bases can be built with them intuitively and clearly. Knowledge representation languages such as KL-ONE, frames or predicate logic are also frequently used.

Speech data analysis

Formal grammars and automata are often used in speech data analysis. For example, an LR parser can efficiently check whether the syntactic structure of textually represented language is correct with respect to an LR grammar. In combination with feature structures, unification can check at the same time whether sentence fragments agree in case, gender and number.
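The agreement check by unification can be sketched for flat feature structures. This is a minimal illustration: real feature structures are nested and may share substructures, which the sketch does not handle, and the feature names and values below are hypothetical.

```python
def unify(a, b):
    """Unify two flat feature structures.

    Returns the merged structure, or None if a feature has conflicting values.
    """
    result = dict(a)
    for feature, value in b.items():
        if feature in result and result[feature] != value:
            return None  # clash, e.g. singular versus plural
        result[feature] = value
    return result


# Hypothetical fragments: an article and two candidate nouns.
ARTICLE = {"case": "nom", "gender": "fem", "number": "sg"}
NOUN_OK = {"gender": "fem", "number": "sg"}
NOUN_CLASH = {"gender": "fem", "number": "pl"}

print(unify(ARTICLE, NOUN_OK))
print(unify(ARTICLE, NOUN_CLASH))
```

A missing feature is treated as unspecified, so it unifies with any value; only explicitly conflicting values make the unification fail.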

Image data analysis

A semantic network language offers a specialized method for (speech and) image data analysis.

In image processing, attributed graphs are used to represent 2D or 3D objects. If, for example, a region-based segmentation is used, the segmented regions can be represented as nodes and the relationships between regions as edges in the graph. A possible node attribute is the color value of the region; a possible edge attribute is a positional relation such as "below". Graphs for known objects are called model graphs; depending on the scenario, there are more or fewer of them. The aim of object recognition is to find one or more of these model graphs in the segmented image. If the segmented image is represented as a graph, the task becomes a comparison of all model graphs with the input graph. If the input graph contains a model graph as a subgraph, the search was successful. Mathematically speaking, this is the search for a subgraph isomorphism with error correction.
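The comparison of a model graph with the input graph can be sketched as a brute-force search for a subgraph match. Note the simplifications: the sketch demands exact agreement of node and edge attributes, so it omits the error correction mentioned above, and its runtime grows factorially with the graph size. The region names, colors and the "below" relation are hypothetical.

```python
from itertools import permutations


def find_model(model_nodes, model_edges, input_nodes, input_edges):
    """Return a mapping model node -> input node, or None if none exists.

    Nodes are dicts name -> attribute; edges are dicts (src, dst) -> relation.
    """
    names = list(model_nodes)
    # Try every injective assignment of model nodes to input nodes.
    for candidate in permutations(input_nodes, len(names)):
        mapping = dict(zip(names, candidate))
        if any(model_nodes[m] != input_nodes[mapping[m]] for m in names):
            continue  # a node attribute (here: color) does not match
        if all(
            input_edges.get((mapping[src], mapping[dst])) == relation
            for (src, dst), relation in model_edges.items()
        ):
            return mapping
    return None


# Hypothetical model graph of a house: a white wall below a red roof.
MODEL_NODES = {"roof": "red", "wall": "white"}
MODEL_EDGES = {("wall", "roof"): "below"}

# Hypothetical input graph from a segmented image with three regions.
INPUT_NODES = {"r1": "blue", "r2": "red", "r3": "white"}
INPUT_EDGES = {("r3", "r2"): "below", ("r2", "r1"): "below"}

print(find_model(MODEL_NODES, MODEL_EDGES, INPUT_NODES, INPUT_EDGES))
```

An error-tolerant variant would replace the exact comparisons with a cost for each mismatch and search for the cheapest mapping.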

Quantitative representation of knowledge

Numerical classifiers, Markov random fields and Bayesian networks are used here.

Control strategies

Literature

  • G. Sagerer: Automatic understanding of spoken language. (= Computer Science Series. Volume 74). BI-Verlag, Mannheim 1990, ISBN 3-411-14391-6.
  • H. Niemann: Pattern Analysis and Understanding. (= Springer Series in Information Sciences. Volume 4). Springer, Berlin 1990, ISBN 3-540-51378-7.
  • P. C. Lockemann, J. W. Schmidt (eds.): Database manual. Springer, 1987, ISBN 3-540-10741-X.
  • A. Pinz: Understanding images. (= Computer science textbooks). Springer, Vienna 1994, ISBN 3-211-82571-1.