Theory of Visual Attention

from Wikipedia, the free encyclopedia

The Theory of Visual Attention (TVA) is a theory of visual attention formulated in terms of mathematical equations. Owing to its complexity, it can account for many psychological findings, including findings from experiments that do not deal with visual attention.

The TVA is a unified theory of recognition and selection. While many theories of visual attention separate these processes both temporally and structurally, the TVA assumes that both are realized by a single mechanism in the form of a race. In other words: if an object in the visual field is recognized, it is selected at the same time, and vice versa.

By combining selection and recognition, the TVA tries to resolve the long-standing debate over whether selection takes place early (i.e. before stimuli are recognized; see Broadbent, 1958) or late (i.e. after the stimuli have been analyzed, for example in terms of their content; see Deutsch & Deutsch, 1963).

The TVA explains attention through two successive processes, filtering and categorizing ("pigeonholing"). On the first level, the perceptual features are represented and weighted, while on the second level these features are categorized (for example “object X has feature i” or “object X belongs to category A”).

During the filtering process, all objects in the visual field compete against each other in a kind of race; only the winning object can then be categorized. Such a categorization also means that the object has been encoded in visual short-term memory (VSTM). If there is no space left in the VSTM, the object cannot be categorized and is therefore not consciously processed.

Filtering

At this first level, all objects in the receptive field are weighted. The attentional weight $w_x$ of an object $x$ is

$$w_x = \sum_{j \in R} \eta(x, j)\,\pi_j,$$

where

  • $R$ is the set of all (visual) categories
    • a visual category can be a certain color, shape, orientation, etc.
    • all categories are treated "on the same level", i.e. they are not sorted or weighted by dimension
  • $\eta(x, j)$ is the sensory evidence that object $x$ belongs to category $j$
    • the sensory evidence can be reduced, e.g. by a blurred presentation of the object
    • the sensory evidence can also be increased, e.g. by similarity to the categories to be attended
  • $\pi_j$ is the relevance of category $j$ to the observer
    • categories that are more important to the observer carry more weight

Top-down processes are thus taken into account, since the relevance of a category to the observer enters into the weighting. At the same time, bottom-up processes are taken into account through the sensory evidence.
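The weighting described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `attentional_weight`, the dictionary layout and all numeric values are assumptions chosen for the sketch. It shows how reduced sensory evidence (for instance from a blurred presentation) lowers an object's weight while the relevance values stay the same.

```python
from typing import Dict


def attentional_weight(evidence: Dict[str, float], relevance: Dict[str, float]) -> float:
    """Sum of sensory evidence times relevance over all categories.

    `evidence` maps a category j to eta(x, j) for one object x;
    `relevance` maps a category j to pi_j. Categories without a
    relevance entry are treated as having relevance 0.
    """
    return sum(eta * relevance.get(category, 0.0) for category, eta in evidence.items())


# Hypothetical relevance values: the observer is looking for red objects.
relevance = {"red": 0.9, "blue": 0.1}

# Hypothetical sensory evidence for a sharp and a blurred red object.
sharp_red = {"red": 1.0, "blue": 0.0}
blurred_red = {"red": 0.5, "blue": 0.0}

w_sharp = attentional_weight(sharp_red, relevance)
w_blurred = attentional_weight(blurred_red, relevance)

print(f"sharp red object:   w = {w_sharp:.2f}")
print(f"blurred red object: w = {w_blurred:.2f}")
```

Treating missing relevance entries as 0 mirrors the idea that categories irrelevant to the observer do not contribute to the weight.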

Examples

Example 1 In a search task, a test person is asked to report which red letters are presented. Red digits as well as blue digits and letters serve as distractors.

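The attentional weights of these objects can now be calculated. For each object $x$, only the two filter categories “red” and “blue” are considered:

$$w_x = \eta(x, \text{red})\,\pi_{\text{red}} + \eta(x, \text{blue})\,\pi_{\text{blue}}$$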

Theoretically, all other categories should also be included; they are omitted here for the sake of simplicity, since their relevance is 0 and therefore they have no influence on the values.

In this simple example, the weights of the individual objects are easy to calculate. Let us first assume that the relevance of the “red” category is 0.9 and that of the “blue” category is 0.1. For the sake of simplicity, let the sensory evidence be 1 or 0 (i.e. red is always perceived as red, blue never as red, etc.). Each red object then has the weight $1 \cdot 0.9 + 0 \cdot 0.1 = 0.9$, and each blue object the weight $0 \cdot 0.9 + 1 \cdot 0.1 = 0.1$.

The blue objects are thus given a very low weight, while all red objects are given a high weight. Importantly, response categories such as “digit” or “letter” do not yet play a role; only the filter categories “red” and “blue” do.

This example is very simple, since only objects of one category (“red”) have to be taken into account to select the answer. In Example 2, several categories have to be considered.

Example 2 In a search task, a test person is asked to decide whether a red triangle stands on its tip or points upwards. Blue triangles, blue circles and red circles, presented at the same time as the red triangle, serve as distractors.

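The attentional weights of these objects can again be calculated. For each object $x$,

$$w_x = \eta(x, \text{red})\,\pi_{\text{red}} + \eta(x, \text{blue})\,\pi_{\text{blue}} + \eta(x, \text{triangle})\,\pi_{\text{triangle}} + \eta(x, \text{circle})\,\pi_{\text{circle}}$$

Let us assume that the relevance of the category “red” is 0.9, of “blue” 0.1, of “triangle” 0.6 and of “circle” 0.01. For the sake of simplicity, let the sensory evidence again be 1 or 0 (i.e. red is always perceived as red, a triangle never as a circle, etc.). This results in the following attentional weights for the four objects:

$$\begin{aligned}
w_{\text{red triangle}} &= 0.9 + 0.6 = 1.5 \\
w_{\text{red circle}} &= 0.9 + 0.01 = 0.91 \\
w_{\text{blue triangle}} &= 0.1 + 0.6 = 0.7 \\
w_{\text{blue circle}} &= 0.1 + 0.01 = 0.11
\end{aligned}$$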

So the red triangle has the highest weight and is more likely to be processed further than any other object in the field of view. However, it is by no means certain which object will win the “race” or how it will be categorized. This is determined in the categorization process.
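The calculation for Example 2 can be reproduced with a short Python sketch. The relevance values are those assumed in the example; the variable names and the representation of each object as a tuple of the categories it belongs to (sensory evidence 1 for those categories, 0 for all others) are illustrative choices, not part of the theory.

```python
# Relevance values assumed in Example 2.
relevance = {"red": 0.9, "blue": 0.1, "triangle": 0.6, "circle": 0.01}

# Each object is described by the categories for which its sensory
# evidence is 1; the evidence for every other category is taken to be 0.
objects = {
    "red triangle": ("red", "triangle"),
    "red circle": ("red", "circle"),
    "blue triangle": ("blue", "triangle"),
    "blue circle": ("blue", "circle"),
}

# Attentional weight: sum of the relevance of the matching categories.
weights = {
    name: sum(relevance[category] for category in categories)
    for name, categories in objects.items()
}
total = sum(weights.values())

# List the objects from highest to lowest weight, with their share of the total.
for name, w in sorted(weights.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:13s} w = {w:.2f}  share of total weight = {w / total:.3f}")
```

The share of the total weight printed in the last column is only a convenience here; it shows how strongly the red triangle dominates the other three objects.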

Categorization

In the race for categorization, the processing speed of each categorization of an object is calculated as follows:

$$v(x, i) = \eta(x, i)\,\beta_i\,\frac{w_x}{\sum_{z \in S} w_z},$$

where

  • $v(x, i)$ is the processing speed of the categorization "object $x$ is $i$"
    • theoretically there is a processing speed for every object-category combination
    • the higher the processing speed, the more likely it is that object $x$ is categorized as $i$ (and thus encoded in the VSTM)
  • $\eta(x, i)$ is the sensory evidence that object $x$ belongs to category $i$
  • $\beta_i$ is a perceptual response bias for category $i$
    • categories that are relevant for the response are weighted more heavily
  • $w_x / \sum_{z \in S} w_z$ is the share of the weight of object $x$ in the total weight of all objects in the visual field $S$

Note that it is not each object that has a processing speed, but each object-categorization combination. So there is one processing speed for the categorization “object x is a” and another for the categorization “object x is b”. The object whose object-categorization combination wins the race is nevertheless itself encoded in the VSTM.
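The point that one object has several processing speeds, one per candidate categorization, can be made concrete with a small Python sketch. The function name `processing_speed`, the object labels and all numbers are hypothetical and serve only to illustrate the product of sensory evidence, response bias and relative attentional weight.

```python
from typing import Dict


def processing_speed(eta_xi: float, beta_i: float, w_x: float,
                     all_weights: Dict[str, float]) -> float:
    """Sensory evidence times response bias times relative attentional weight."""
    total = sum(all_weights.values())
    if total <= 0:
        raise ValueError("total attentional weight must be positive")
    return eta_xi * beta_i * w_x / total


# Hypothetical display with two objects and their attentional weights.
weights = {"x": 3.0, "y": 1.0}

# The same object x takes part in the race with two categorizations,
# "x is a" (strong sensory evidence) and "x is b" (weak sensory evidence).
v_x_is_a = processing_speed(eta_xi=1.0, beta_i=0.5, w_x=weights["x"], all_weights=weights)
v_x_is_b = processing_speed(eta_xi=0.2, beta_i=0.5, w_x=weights["x"], all_weights=weights)

print(f"v(x, a) = {v_x_is_a:.3f}")
print(f"v(x, b) = {v_x_is_b:.3f}")
```

Both speeds share the same relative weight of object x; they differ only in the sensory evidence for the respective category.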

Example

Following Example 1 of the visual search task above, we now consider the processing speeds with which the categorizations take part in the "race" in the categorization phase (note: it is not the objects themselves that compete for a place in the VSTM, but objects paired with a particular categorization!). The total weight in our example is $0.9 + 0.9 + 0.1 + 0.1 = 2$; the relative weight is thus $0.9 / 2 = 0.45$ for each of the two red objects and $0.1 / 2 = 0.05$ for each of the two blue objects.

Since the test person's task is to report which letter or letters are among the red objects, the response, provided the test person follows the instructions, falls into one of 26 possible response categories (one for each letter). The categories “red” and “blue” are no longer of importance here; only “a”, “b”, “c”, etc. matter. Consequently, there are 26 bias values, e.g. $\beta_a$ and $\beta_b$, that are high, whereas the values for digits (or for entirely different categories such as "flower") are very low.

The physical stimulus quality is also decisive for such a categorization. Note that, for example, a “2” can resemble a “Z” and thus provide relatively high sensory evidence for one of the response categories, namely “Z”, although it is not a target stimulus.

Now let us calculate some processing speeds for our example. The physical stimulus quality is again perfect, so the sensory evidence is 0 or 1. The exception in our example is the “2”, which also provides some sensory evidence for the category “Z”, i.e. $\eta(2, Z) > 0$. The perceptual response bias is 0.8 for letters and 0.05 for digits.

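This results in the following processing speeds when each object is categorized as what it actually is (sensory evidence 1):

$$\begin{aligned}
v(\text{red letter}) &= 1 \cdot 0.8 \cdot \tfrac{0.9}{2} = 0.36 \\
v(\text{blue letter}) &= 1 \cdot 0.8 \cdot \tfrac{0.1}{2} = 0.04 \\
v(\text{red digit}) &= 1 \cdot 0.05 \cdot \tfrac{0.9}{2} = 0.0225 \\
v(\text{blue digit}) &= 1 \cdot 0.05 \cdot \tfrac{0.1}{2} = 0.0025
\end{aligned}$$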
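The effect of the “2” resembling a “Z” can be explored with a short Python sketch. Several details are assumptions made only for this illustration and are not given in the example: the concrete display (a red “A”, a red “2”, a blue “B” and a blue “7”), the assumption that the “2” is one of the red objects, and the value 0.5 for the sensory evidence that the “2” is a “Z”. The weights (0.9 for red, 0.1 for blue objects) and the biases (0.8 for letters, 0.05 for digits) are those of the example.

```python
# Attentional weights from Example 1; the concrete stimuli are hypothetical.
weights = {"red A": 0.9, "red 2": 0.9, "blue B": 0.1, "blue 7": 0.1}
total_weight = sum(weights.values())


def bias(category: str) -> float:
    """Perceptual response bias: 0.8 for letters, 0.05 for digits."""
    return 0.8 if category.isalpha() else 0.05


# Sensory evidence eta(x, i); every pair not listed here is taken to be 0.
evidence = {
    ("red A", "A"): 1.0,
    ("red 2", "2"): 1.0,
    ("red 2", "Z"): 0.5,  # ASSUMPTION: illustrative value, not from the example
    ("blue B", "B"): 1.0,
    ("blue 7", "7"): 1.0,
}

# Processing speed of every object-categorization combination.
speeds = {
    (obj, category): eta * bias(category) * weights[obj] / total_weight
    for (obj, category), eta in evidence.items()
}

for (obj, category), v in sorted(speeds.items(), key=lambda item: item[1], reverse=True):
    print(f"v({obj} is {category}) = {v:.4f}")
```

Under these assumptions the miscategorization of the red “2” as “Z” is faster than its correct categorization as “2”, because the high bias for letters outweighs the lower sensory evidence.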

NTVA

In 2005 the TVA was developed further into the NTVA (Neural Theory of Visual Attention; Bundesen, Habekost & Kyllingsbæk, 2005). The NTVA supplies the explanation at the neural level whose absence from the TVA had previously been criticized.

References

  1. Claus Bundesen: A theory of visual attention. In: Psychological Review. Vol. 97, no. 4, 1990, ISSN 1939-1471, pp. 523–547, doi:10.1037/0033-295x.97.4.523 (apa.org [accessed June 6, 2018]).
  2. Claus Bundesen, Signe Vangkilde, Anders Petersen: Recent developments in a computational theory of visual attention (TVA). In: Vision Research. Vol. 116, November 2015, ISSN 0042-6989, pp. 210–218, doi:10.1016/j.visres.2014.11.005 (elsevier.com [accessed June 6, 2018]).
  3. D. E. Broadbent: Perception and communication. 1958, doi:10.1037/10037-000 (apa.org).
  4. J. A. Deutsch, D. Deutsch: Attention: Some theoretical considerations. In: Psychological Review. Vol. 70, no. 1, January 1963, ISSN 1939-1471, pp. 80–90, doi:10.1037/h0039515 (apa.org).
  5. Claus Bundesen, Thomas Habekost, Søren Kyllingsbæk: A Neural Theory of Visual Attention: Bridging Cognition and Neurophysiology. In: Psychological Review. Vol. 112, no. 2, 2005, ISSN 1939-1471, pp. 291–328, doi:10.1037/0033-295x.112.2.291 (apa.org [accessed June 6, 2018]).