Glottochronology

Glottochronology (from Attic Greek γλῶττα "tongue, language" and χρόνος "time") is the branch of lexicostatistics that deals with the temporal relationships between languages. In particular, its advocates claim to be able to calculate how much time has passed since languages regarded as related separated from their common ancestor. The method rests on the assumption that replacements in a universal test list behave, in all languages and at all times, in the same constant way as in the few cases that are documented by written texts over a known period. This assumption has been refuted.

Origin and development

There is no single glottochronology. Beyond the basic assumption mentioned above, two fundamentally different assumptions about the type of decay have to be distinguished, along with further factors added by individual authors. The results of the various approaches largely contradict one another. All calculations rest on word lists that are usually carelessly compiled or taken from outdated works.

Method of equating with radioactive decay: "classical glottochronology"

This approach is based on the formula for radioactive decay. Although it is misunderstood again and again, the formula implies that at any point in time all remaining radioactive isotopes have the same probability of decaying, so that the same percentage of the exponentially shrinking number of isotopes decays in equal time periods. In the 1950s these laws came to be used to determine the age of radioactive material.
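The property described above can be illustrated with a minimal sketch. It assumes the usual exponential form N(t) = N0 * exp(-lambda * t); the starting count and the decay constant are hypothetical values chosen only for illustration.

```python
import math


def remaining_fraction(decay_constant: float, elapsed: float) -> float:
    """Fraction of the original elements still present after `elapsed` time units."""
    return math.exp(-decay_constant * elapsed)


def decayed_in_period(n0: float, decay_constant: float, start: float, length: float) -> float:
    """Absolute number of original elements that decay between `start` and `start + length`."""
    return n0 * (remaining_fraction(decay_constant, start)
                 - remaining_fraction(decay_constant, start + length))


if __name__ == "__main__":
    n0, lam = 1000.0, 0.15  # hypothetical starting count and decay constant
    for period in range(4):
        lost = decayed_in_period(n0, lam, period, 1.0)
        still_there = n0 * remaining_fraction(lam, period)
        # The share lost / still_there is identical in every period,
        # while the absolute number `lost` shrinks from period to period.
        print(period, round(lost, 1), round(lost / still_there, 4))
```

The loop prints, for each period, the absolute loss and the loss relative to what was still present at the start of that period. Only the first of the two changes over time.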

This inspired the American linguist Morris Swadesh to apply the method to determining the age of languages as well. He equated the properties of the word cores originally assumed in his test lists with those of radioactive isotopes, because both decrease over time. Because the law is often stated imprecisely, it is frequently overlooked that what decreases exponentially is the absolute number of original elements decaying in successive periods, and only that number.

In order to compare as many different languages as possible, Swadesh designed word lists that were meant to be as culture-independent as possible, i.e. "universal". The lists were also meant to represent a vocabulary that is as stable as possible, so that sufficient similarities remain even between distantly related languages. He gave these lists various names, most aptly the universal test list, but they soon came to be called Swadesh lists. There is, incidentally, no single Swadesh list, because Swadesh revised it several times: starting with 200 items, expanded to 215, and finally reduced to 100 (as published posthumously in 1972). There are also more than a dozen versions by other authors.

First, Lees (1953) calculated the decay rate of 215 test terms in 13 languages, some with widely spaced textual attestations, e.g. Ancient Egyptian. In 1955 Swadesh re-examined seven of them and at the same time compared his test list, which by then had been reduced to 100 words.
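The dating rule usually associated with this classical approach can be sketched as follows. The formula t = ln(c) / (2 ln(r)) is the common textbook form and is not spelled out in this article; c is the share of list items still common to two languages and r the assumed retention rate per millennium. The function name and the example values are assumptions made for illustration. A retention of 86 percent corresponds to the replacement rate of 14 percent attributed to Swadesh elsewhere in this article, on the assumption that this rate is meant per millennium.

```python
import math


def divergence_time(shared_fraction: float, retention_rate: float) -> float:
    """Millennia since separation under the classical constant-rate assumption.

    t = ln(c) / (2 * ln(r)). The factor 2 reflects the assumption that both
    languages replace list items independently and at the same rate.
    """
    if not 0.0 < shared_fraction <= 1.0:
        raise ValueError("shared_fraction must lie in (0, 1]")
    if not 0.0 < retention_rate < 1.0:
        raise ValueError("retention_rate must lie in (0, 1)")
    return math.log(shared_fraction) / (2.0 * math.log(retention_rate))


if __name__ == "__main__":
    r = 0.86  # assumed retention per millennium (14 % replacement)
    for c in (0.9, 0.74, 0.5, 0.3):
        print(c, round(divergence_time(c, r), 2))
```

The whole result hinges on the single number r, which is why the criticism reported in this article concentrates on the assumption of a constant rate.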

Glottochronology soon met with severe criticism. As early as 1962, Knut Bergsland and Hans Vogt demonstrated that the assumption of constant replacement rates is untenable. In 1973 Johann Tischler found that the method yielded unrealistic separation dates for the Indo-European languages.

Advocates of glottochronology see the main cause of this in unrecognized borrowings, which they have dealt with in different ways:

The linguist Sheila Embleton used her prior knowledge of borrowings in the Germanic languages to extrapolate them quantitatively with additional algorithms, and thus achieved impressive results. However, these results reach back only a little beyond the time of the earliest attestations. The complexity of her methodology and the uncertainties in the data analysis for other language families have prevented further tests to this day.
The Russian linguist Sergei Anatolyevich Starostin simply counted only the “really important” internal innovations. Among other things, he placed an emphasis on the Dene-Caucasian hypothesis, where temporal reference points are problematic. His claim to have refuted the calculations by Bergsland and Vogt rests on questionably different etymological interpretations of words. He groups Albanian with Greek, but Balto-Slavic with Indo-Iranian. In the same tradition, Václav Blažek (2007) uses extended, less restricted word lists. Starostin finally came to the view that one should start from etymological “roots” rather than from meanings, but he died in 2005 without having pursued this approach further.

Sheila Embleton (2000) and Hans J. Holm (2007) provide overviews of the history of research. Although the organizers of the Time Depth conference tried to strike a balance, no full professor of Indo-European studies or comparative linguistics could be found to support glottochronology.

All variants of this traditional approach rest on three flawed assumptions, namely that the words in the lists, like radionuclides (radioactive isotopes),

all have the same probability of “decaying” (for words: of “being replaced”);
can decay or be replaced only once;
do so at a rate that is roughly the same for all languages.
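Why the first of these assumptions matters can be shown with a small calculation. The sketch below is an illustration under stated assumptions, not a method taken from the literature cited here: the list is assumed to consist of two equally large classes of items with different retention rates per millennium (hypothetical values), each item is assumed to be replaced independently in each language, and the age is then estimated with the classical formula t = ln(c) / (2 ln(r)) using one uniform rate.

```python
import math


def shared_fraction(retention_rates, millennia):
    """Expected share of list items still common to two languages.

    Each item survives in each of the two languages independently with
    probability r ** millennia, hence with r ** (2 * millennia) in both.
    """
    return sum(r ** (2 * millennia) for r in retention_rates) / len(retention_rates)


def apparent_age(shared, single_rate):
    """Age obtained when one uniform retention rate is assumed for every item."""
    return math.log(shared) / (2 * math.log(single_rate))


# Hypothetical list: half the items are stable, half are volatile.
RATES = [0.95, 0.75]
# Average retention observed after one millennium in a single language.
MEAN_RATE = sum(RATES) / len(RATES)

if __name__ == "__main__":
    for true_age in (0.5, 1, 2, 3, 5):
        estimate = apparent_age(shared_fraction(RATES, true_age), MEAN_RATE)
        print(true_age, round(estimate, 2))
```

Under these assumptions the volatile items are used up quickly and the stable ones dominate later, so the uniform-rate estimate increasingly falls short of the true age as the separation grows older.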

Method of equating with evolutionary biological assumptions

Many bioinformaticians, though by no means all, assume a fixed mutation rate for genes, whose number, in contrast to radioactive elements, does not decrease. The algorithms developed under these assumptions have in recent years also been applied mechanically to Swadesh lists and other word lists. The best-known example is a study by Gray and Atkinson. Despite state-of-the-art procedures and elaborate modifications of the replacement rates, the result is convincing neither in its dating nor in its structure: the dates reach extremely far into the past, and structurally Albanian was erroneously grouped with Indo-Iranian and Germanic with Italic. In addition, the presentation conceals the fact that the analysis initially yields only an undirected tree (an unrooted phylogeny), and that the position of Hittite was introduced afterwards (Holm 2007). The team, whose membership changes, publishes new and different family trees almost every year.

The variants of this "biological" approach rest, in addition to the faulty data already mentioned, on two questionable assumptions, namely that the words in the lists, like genes and their alleles, (1) "mutate" (for words: "are replaced") with calculable probabilities, and (2) do so at a replacement rate that is roughly the same for all languages. The latter means that separation dates that are too early must result for languages that died out early and for languages with many replacements.

Basics

Sociological aspects

In contrast to what glottochronologists assume, linguistic change does not work like a perpetual motion machine but has tangible, mostly comprehensible psycho-social and socio-historical causes that can be neither foreseen nor calculated. This remains true even if such effects occur less often in the Swadesh lists mentioned above than in the rest of the vocabulary (the so-called Zipf distribution). Besides the findings of Bergsland and Vogt, anyone familiar with linguistic history and ethnology can easily find further counterexamples with socio-historical causes:

For example, there are languages that have long been subject to little outside influence. Reasons include an isolated location (so-called "conservative fringes", e.g. Icelandic) or a cultural self-confidence that affects the language, e.g. in Greek, which resisted the influence of the Romance-speaking world.
The spectrum of language change also includes a large number of intermediate stages and mixed languages, the so-called pidgin and creole languages, which arise not only today as a result of colonization and displacement.
Many languages have even died out over longer or shorter periods, e.g. the Hittite language after 1200 BC; the languages of many so-called pygmy tribes, because of stronger external ties to and economic dependence on neighboring Bantu tribes; the language of the Veddas (Wedda) in Sri Lanka, who have adopted Sinhala or Tamil; Gaulish in what is now France after the subjugation by Caesar; and many others, all completely independent of any "rate".

This socio-historical dependence of linguistic change has been and is repeatedly emphasized by all leading historical-comparative linguists around the world.

Historical-archaeological aspects

Criticism is directed above all (see above) at the dating results. Acceptable dates within the approximate period of the documented reference cases prove nothing; the prehistoric dates that are actually sought, however, cannot be verified. Where necessary, the rates are "adjusted": Starostin, for example, changed the rate calculated by Swadesh for the Indo-European languages from 14% to 5%.
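How strongly such an adjustment affects the result can be checked with a short calculation. The sketch assumes that both percentages are replacement rates per millennium and that dates are computed with the classical formula t = ln(c) / (2 ln(r)), where r = 1 - replacement rate; neither assumption is stated explicitly in this article, and the function names are chosen for illustration only.

```python
import math


def divergence_time(shared_fraction: float, replacement_rate: float) -> float:
    """Millennia since separation for a given replacement rate per millennium."""
    retention = 1.0 - replacement_rate
    return math.log(shared_fraction) / (2.0 * math.log(retention))


def rate_adjustment_factor(old_rate: float, new_rate: float) -> float:
    """Factor by which every computed age changes when the rate is swapped.

    The shared fraction cancels out: t_new / t_old = ln(1 - old) / ln(1 - new).
    """
    return math.log(1.0 - old_rate) / math.log(1.0 - new_rate)


if __name__ == "__main__":
    factor = rate_adjustment_factor(0.14, 0.05)
    print(round(factor, 2))
    for c in (0.8, 0.6, 0.4):  # hypothetical shares of common list items
        print(c, round(divergence_time(c, 0.14), 2), round(divergence_time(c, 0.05), 2))
```

Under these assumptions, replacing 14% by 5% makes every computed separation date roughly three times older, regardless of the word lists being compared.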

The problem of the baseline dates used to determine the assumed decay rates can be seen in the Scandinavian languages. The separation of Icelandic can, for example, be dated to the beginning of the settlement of Iceland in the 9th century, but Norwegian only developed away from it centuries later. Above all, the 4 to 19 borrowings in the Swadesh list (the number depends on the author) did not “arise as a function of time”, but were essentially adopted during the approximately 300 years of Danish rule from 1536 to 1814.

Linguistic aspects

Criticism aimed solely at the stability of certain semantic fields in the Swadesh lists (cf. Haarmann 1990) misses the point, since a certain amount of change is not denied at all. The technical weaknesses of the test list, which are often (rightly) criticized, likewise do not touch the core of the method.

More serious is the poor linguistic quality of most Swadesh lists, e.g. the Dyen list, which has long been freely available on the Internet: errors in the English section that Sheila Embleton pointed out as early as 1995 have never been corrected, and the Albanian part contains a further twelve percent errors.

In many, if not most, cases of the "language change" assumed by glottochronologists, what is involved is not change over time but substrates: remnants of an earlier vocabulary that were preserved, for various reasons, when a new standard language was adopted. A well-known example is the maritime substrate of the Germanic languages (e.g. mast, keel, sail), i.e. lexemes from domains in which the indigenous population was more competent than the immigrant speakers of the (here) Indo-European languages. The same applies to the technique of weaving. Examples from the Swadesh list are given by Aharon Dolgopolsky, the teacher of the above-mentioned S. Starostin, in what is otherwise the best-informed criticism from a humanities perspective.

Mathematical-stochastic aspects

It is undisputed that most languages are subject to influences and changes of varying strength in the course of their history. Statistical summation can often produce roughly fitting totals, which are then imprecisely interpreted as a "rate". A comparison of many studies yields a normally distributed (Gaussian) curve, whose parameters require further study.

The second fundamental assumption of glottochronologists builds on the assumption of a rate: languages are taken to be the more closely related the more inherited words they share. This ad hoc assumption, plausible at first glance, overlooks the fact that the number of shared words depends on three further parameters (the proportionality error). All glottochronological work thus violates a basic mathematical rule, namely to analyze first the stochastic distributions of the data used, in this case the hypergeometric distribution and the broken Zipf or Pareto distribution.
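One possible reading of this point can be sketched with the hypergeometric distribution itself. The sketch assumes that the three parameters are the size of the ancestral list and the number of items each of the two languages has retained, and that the two languages retain items independently of each other; these are illustrative assumptions, and the function names and numbers are hypothetical.

```python
from math import comb


def shared_pmf(list_size: int, kept_a: int, kept_b: int, shared: int) -> float:
    """Probability that exactly `shared` items survive in both languages.

    Hypergeometric model: language B keeps `kept_b` of the `list_size`
    ancestral items, of which `kept_a` are also kept by language A.
    """
    lowest = max(0, kept_a + kept_b - list_size)
    highest = min(kept_a, kept_b)
    if shared < lowest or shared > highest:
        return 0.0
    return (comb(kept_a, shared) * comb(list_size - kept_a, kept_b - shared)
            / comb(list_size, kept_b))


def expected_shared(list_size: int, kept_a: int, kept_b: int) -> float:
    """Mean of the hypergeometric distribution: kept_a * kept_b / list_size."""
    return kept_a * kept_b / list_size


if __name__ == "__main__":
    # Two pairs that separated from the same ancestor, with different losses.
    print(expected_shared(100, 90, 90))  # both languages conservative
    print(expected_shared(100, 90, 50))  # one language with many replacements
```

Under this model a pair containing one strongly innovating language shares far fewer items by chance alone, so the raw number of shared words is not proportional to closeness of relationship.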

Characteristic of the general assessment is that one Indo-Europeanist even assigned glottochronology to the realm of science fiction.

Literature

Arne A. Ambros: Linguistic and statistical evaluation of lexical coincidence phenomena. In: Karl-Heinz Best, Jörg Kohlhase (Eds.): Exact language change research. Theoretical contributions, statistical analyses and work reports. edition herodot, Göttingen 1983, ISBN 3-88694-024-1, pp. 21-43.
Knut Bergsland, Hans Vogt: On the Validity of Glottochronology. In: Current Anthropology. Vol. 3, No. 2, April 1962, pp. 115-153.
Lyle Campbell: Historical Linguistics; An Introduction. Edinburgh University Press, Edinburgh 1998, ISBN 0-7486-0775-7, Chapter 6.5. Glottochronology.
Aharon Dolgopolsky: Sources of linguistic chronology. In: C. Renfrew, A. McMahon, Larry Trask (Eds.): Time depth in historical linguistics. Vol 2 [16], The McDonald Institute for Archaeological Research, Cambridge, UK 2000, ISBN 1-902937-14-7, pp. 401-409. (Probably the most well-founded contribution in the anthology)
Sheila Embleton: Lexicostatistics / Glottochronology: from Swadesh to Sankoff to Starostin to future horizons. In: C. Renfrew, A. McMahon, Larry Trask (Eds.): Time depth in historical linguistics. Vol 1 [7], The McDonald Institute for Archaeological Research, Cambridge, UK 2000, ISBN 1-902937-13-9, pp. 143-167.
Harald Haarmann: Basic vocabulary and language contacts; the disillusion of glottochronology. In: Indogermanische Forschungen. No. 95, 1990, pp. 1-37.
L. Hoffmann, R. G. Piotrowski: Contributions to language statistics. Verlag Enzyklopädie, Leipzig 1979, pp. 162-174.
Hans J. Holm: Genealogical relationship. In: Quantitative Linguistics. (= Handbücher zur Sprach- und Kommunikationswissenschaft. Volume 27). de Gruyter, Berlin 2005, chap. 45.
Hans J. Holm: The new Arboretum of Indo-European 'Trees'; Can new algorithms reveal the Phylogeny and even Prehistory of Indo-European? In: Journal of Quantitative Linguistics. Volume 14, No. 2, 2007, pp. 167-214.
Hans J. Holm: Steppe homeland of Indo-Europeans favored by a Bayesian approach with revised data and processing. In: Glottometrics. Volume 37, 2017, pp. 54-81.
David Sankoff: On the Rate of Replacement of Word-Meaning Relationships. In: Language. Volume 46, 1970, pp. 564-569.
Morris Swadesh: Towards greater accuracy in lexicostatistic dating. In: International Journal of American Linguistics. Vol. 21, Univ. of Chicago Press, Chicago 1955, pp. 121-137. ISSN 0020-7071
Morris Swadesh: What is glottochronology? In: M. Swadesh: The origin and diversification of language. Routledge & Kegan Paul, London 1972, ISBN 0-7100-7195-7, pp. 271-284.
Johann Tischler: Glottochronology and Lexicostatistics. (= Innsbruck contributions to linguistics. Volume 11). Innsbruck 1973, pp. 143-167.

Web links

Lexicostatistics and glottochronology (christianlehmann.eu)

Bernhard Ganter: Language Comparison - Qualitative Methods. Lecture, summer semester 2004, Technical University of Dresden (math.tu-dresden.de)

References

Bergsland / Vogt 1962.
Hans J. Holm: Albanian basic word lists and the position of Albanian in the Indo-European languages. In: Zeitschrift für Balkanologie. Volume 45, No. 2, 2009.
Sheila M. Embleton: Statistics in Historical Linguistics. (= Quantitative Linguistics. 30). Brockmeyer, Bochum 1986.
Bergsland / Vogt 1962.
Tischler 1973.
Sheila M. Embleton: Statistics in Historical Linguistics. Brockmeyer, Bochum 1986, ISBN 3-88339-537-4.
Václav Blažek: From August Schleicher to Sergej Starostin. On the development of tree-diagram models of the Indo-European languages. In: The Journal of Indo-European Studies. Vol. 35, No. 1, 2007, pp. 82-109.
R. D. Gray, Q. D. Atkinson: Language-tree divergence times support the Anatolian theory of Indo-European origin. In: Nature. 426/2003, pp. 435-438.
Gerhard Jäger: How bioinformatics helps to reconstruct the history of language. Tübingen, November 24, 2011. (sfs.uni-tuebingen.de)
Gerhard Jäger: Computational historical linguistics. University of Tübingen, Institute of Linguistics (arxiv.org)
Gerhard Jäger: Lexicostatistics 2.0. In: Albrecht Plewnia, Andreas Witt (Eds.): Sprachverfall? Dynamics - change - variation. (= Yearbook of the Institute for the German Language 2013). de Gruyter, Berlin / Boston 2014, ISBN 978-3-11-037474-2, pp. 197-216. (ids-pub.bsz-bw.de)
Hans J. Holm: The new Arboretum of Indo-European "Trees"; Can new algorithms reveal the Phylogeny and even Prehistory of Indo-European? In: Journal of Quantitative Linguistics. Volume 14, No. 2, 2007, pp. 1-50.
Sheila M. Embleton: Statistics in Historical Linguistics. Brockmeyer, Bochum 1986, ISBN 3-88339-537-4, p. 132 f.
V. Blažek: From August Schleicher to Sergej Starostin. On the development of the tree-diagram models of the Indo-European languages. In: The Journal of Indo-European Studies. Volume 35, No. 1-2, 2007, p. 85.
Sheila M. Embleton: Review of Dyen / Kruskal / Black: A Lexicostatistical Experiment. In: Diachronica. Vol. 12, No. 2, 1995, pp. 263-268.
Aharon Dolgopolsky: Sources of linguistic chronology. 2000, p. 401 f.
Hans J. Holm: The proportionality trap. Or: what is wrong with lexicostatistical subgrouping? In: Indogermanische Forschungen. No. 108, 2003, pp. 38-46.
W. Euler: Body part names in Albanian and their origin. In: Indogermanische Forschungen. No. 90, 1985, p. 104 f.
