Co-occurrence

from Wikipedia, the free encyclopedia

In general linguistics, co-occurrence refers to the joint occurrence of two lexical units (e.g. words) within a higher-level unit, such as a sentence or a document. Two terms are assumed to depend on each other if they appear together conspicuously often. Statistical tests, such as variants of mutual information (transinformation) or likelihood-ratio tests, provide measures of the suspected dependency. The dependency can have grammatical as well as semantic causes.
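As an illustration of such measures, the following minimal sketch computes pointwise mutual information (one common variant of mutual information) and a log-likelihood ratio statistic from simple counts. The function names, the parameter names and the choice of these two particular formulas are assumptions made for this example; the article does not prescribe a specific test.

```python
import math

def pmi(n_xy, n_x, n_y, n):
    """Pointwise mutual information (in bits) of two terms x and y.

    n_xy: number of units (e.g. sentences) containing both x and y
    n_x, n_y: number of units containing x, respectively y
    n: total number of units
    """
    return math.log2((n_xy * n) / (n_x * n_y))

def log_likelihood_ratio(n_xy, n_x, n_y, n):
    """G^2 statistic for the 2x2 contingency table of x and y."""
    observed = [
        n_xy,                  # x and y together
        n_x - n_xy,            # x without y
        n_y - n_xy,            # y without x
        n - n_x - n_y + n_xy,  # neither x nor y
    ]
    # Counts expected if x and y were independent
    expected = [
        n_x * n_y / n,
        n_x * (n - n_y) / n,
        (n - n_x) * n_y / n,
        (n - n_x) * (n - n_y) / n,
    ]
    # Cells with an observed count of zero contribute nothing to the sum
    return 2.0 * sum(
        o * math.log(o / e) for o, e in zip(observed, expected) if o > 0
    )
```

Both values are zero when the two terms co-occur exactly as often as independence would predict, and they grow as the terms appear together more often than expected.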

If a grammatical or semantic dependency between two frequently co-occurring terms has been established, the pair is called a collocation.

Both concepts are very important in information retrieval.

Examples of co-occurrences

  1. The sentences "I am sitting in the Bank", "I go to the Bank" and "I am sitting on the Bank" (the German word Bank can mean either a bench or a bank building) connect the verbs sit and go with the object Bank in a way that is not accidental. Only in the second sentence does an ambiguity remain: the verb go does not resolve it, because the object can still be a bench or a building. Sitting does not depend on the bench, however; the object could just as well be a chair. Still, the probability of sitting on a bench is greater than that of sitting on a leash or of cooking the bench.
  2. Idioms, on the other hand, are fixed co-occurrences because they are rigid expressions, such as "it is raining cats and dogs" (German: es regnet Bindfäden, literally "it is raining twine").
  3. The expected probability is also very high that, when an "if" occurs in a sentence, a "then" follows, expressing a causal connection between precondition and conclusion. As this shows, terms that depend on each other do not have to be adjacent, but they do appear in a logical sequence.
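The "if ... then" case from the last example can be sketched as a simple estimate of a conditional probability. The toy corpus, the whitespace tokenization and the function name below are hypothetical and serve only to illustrate the idea that the two terms need not be adjacent.

```python
def conditional_probability(sentences, given, target):
    """Estimate P(target occurs later in the sentence | given occurs)."""
    with_given = 0
    with_both = 0
    for tokens in sentences:
        if given in tokens:
            with_given += 1
            first = tokens.index(given)
            # Look for the target anywhere after the first occurrence of `given`
            if target in tokens[first + 1:]:
                with_both += 1
    return with_both / with_given if with_given else 0.0

# Hypothetical toy corpus, tokenized by splitting on whitespace
corpus = [
    "if it rains then we stay inside",
    "if you sit on the bench then you can rest",
    "we go to the bank",
    "if she calls we answer",
]
sentences = [s.split() for s in corpus]
p_then_given_if = conditional_probability(sentences, "if", "then")
```

In this toy corpus, two of the three sentences containing "if" also contain a later "then", so the estimate is 2/3.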

Sentence and neighborhood co-occurrence

In text-mining practice, a distinction is made between sentence co-occurrence (lexical units appear together in a sentence) and neighborhood co-occurrence (lexical units stand directly next to each other). Larger contexts (paragraph or document co-occurrence) are also conceivable, but in practice they are not considered, not least because of the high computational cost of machine processing.
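The difference between the two kinds of counting can be sketched as follows. This is a minimal example under stated assumptions: sentences are already split into lowercase tokens, sentence co-occurrence is counted once per sentence for each unordered pair, and neighborhood co-occurrence is counted for ordered pairs of directly adjacent tokens. The function names are chosen for this example only.

```python
from collections import Counter
from itertools import combinations

def sentence_cooccurrences(sentences):
    """Count unordered pairs of distinct tokens that share a sentence."""
    counts = Counter()
    for tokens in sentences:
        # sorted(set(...)) makes each pair unique and order-independent
        for pair in combinations(sorted(set(tokens)), 2):
            counts[pair] += 1
    return counts

def neighbor_cooccurrences(sentences):
    """Count ordered pairs of tokens that stand directly next to each other."""
    counts = Counter()
    for tokens in sentences:
        for left, right in zip(tokens, tokens[1:]):
            counts[(left, right)] += 1
    return counts

# Hypothetical toy input: two pre-tokenized sentences
sentences = [
    ["i", "sit", "on", "the", "bench"],
    ["i", "go", "to", "the", "bench"],
]
sent_counts = sentence_cooccurrences(sentences)
neigh_counts = neighbor_cooccurrences(sentences)
```

Here "sit" and "bench" co-occur at the sentence level but never as neighbors, while "the" and "bench" co-occur in both senses. The sentence-level table grows quadratically with sentence length, which hints at why even larger contexts become expensive to process.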


References

  1. Bußmann, Hadumod: Lexicon of Linguistics. Kröner, Stuttgart 2002.