Language families of the world: Indo-European languages highlighted in yellow

The Indo-European languages (also called Indo-Germanic languages) form the language family with the most speakers in the world today, with around three billion native speakers. Their wide distribution is the result of migrations over the millennia and, most recently, of European expansion since the 15th century.

The languages belonging to this family show extensive similarities in vocabulary, in inflection, in grammatical categories such as number and gender, and in ablaut. A single prehistoric Indo-European proto-language is assumed as their common origin; its main features have been reconstructed by comparing the individual languages.

A hypothetical family tree of the Indo-European languages

The designation

The two common terms are bracketing terms that span the (pre-colonial) geographical distribution of the language family. They reflect the knowledge and tradition of the early 19th century, when nothing was yet known about Hittite and Tocharian.

The term Indo-Germanic (indogermanisch), which is common in German-language linguistics, is based on the geographically most distant language groups of the (pre-colonial) area of distribution: the Indo-Aryan languages in the southeast (with Sinhalese in Sri Lanka) and the Germanic languages in the northwest (with Icelandic). The designation was introduced as langues indo-germaniques in 1810 by the Danish-French geographer Conrad Malte-Brun (1775-1826), who assumed that the language family extended from the Ganges to the Oceanus Germanicus (North Sea). Heinrich Julius Klaproth later introduced the term "Indo-Germanic" to German-speaking countries in his Asia Polyglotta, published in 1823. Franz Bopp (1833), however, speaks of the "Indo-European" languages.

The compounds Indo-Germanic and Indo-European are therefore not to be understood as if the right-hand element, -Germanic or -European, were the core of the word and thus classified all the peoples involved accordingly. The more modern term Indo-European must likewise (by analogy with Indo-Germanic) be understood as "languages that occur in an area from Europe to India". For example, Persian, Kurdish and Armenian are "Indo-European" languages whose homeland is neither in Europe nor in India; the same applies to the extinct languages Hittite and Tocharian.

Some researchers set the Anatolian languages, which split off early, against all the other Indo-European languages as a primary branch and refer to the whole as Indo-Hittite. This term is largely rejected in Indo-European studies today, since the Anatolian branch, despite its certainly early split, is viewed as one of several primary branches of Indo-European, alongside, for example, Germanic, Italic, Celtic or Indo-Iranian. The term Aryan languages, which was also used in British linguistics in the 19th century, is completely obsolete. In English-language literature, however, Aryan is still used for the subgroup of Indo-Iranian languages.

Origin and development

The Indo-European languages are considered genealogically related, i.e. "daughter languages" of a "mother language", the unattested Proto-Indo-European (PIE). That their similarity arose only through typological convergence, in the manner of a sprachbund, can be ruled out because of the numerous regular correspondences. The long-known fact that the Romance languages are the successors of Latin, or rather of Vulgar Latin, together with some similar cases such as the Scandinavian languages, led to the concept of the language family. This concept was then transferred to groups of languages that likewise descend from a common ancestor language, where that ancestor is not known from texts and its former existence can only be inferred hypothetically through reconstruction. The reconstruction is based primarily on similarities in grammatical forms and on related words (cognates). A high number of cognates indicates a genealogical relationship if the vocabulary compared belongs to the basic vocabulary.
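What "regular correspondence" means can be illustrated with a small sketch. It assumes a hand-picked sample of well-known Latin and English cognate pairs (the sample is not taken from this article), and the names COGNATES, initial_sound and initial_correspondences are purely illustrative. It only tallies which initial sounds line up across the pairs. Real comparative work covers whole sound systems, not just initials.

```python
from collections import Counter

# Well-known Latin/English cognate pairs (illustrative sample, chosen by hand).
COGNATES = [
    ("pater", "father"),
    ("pes", "foot"),
    ("piscis", "fish"),
    ("plenus", "full"),
    ("tres", "three"),
    ("tu", "thou"),
    ("tenuis", "thin"),
]

# Spellings that stand for a single sound (assumption: only "th" matters here).
DIGRAPHS = ("th",)


def initial_sound(word: str) -> str:
    """Return the first sound of a word, treating listed digraphs as one unit."""
    for digraph in DIGRAPHS:
        if word.startswith(digraph):
            return digraph
    return word[0]


def initial_correspondences(pairs):
    """Count how often each (Latin initial, English initial) pairing occurs."""
    return Counter((initial_sound(a), initial_sound(b)) for a, b in pairs)


if __name__ == "__main__":
    for (latin, english), count in initial_correspondences(COGNATES).most_common():
        print(f"Latin {latin}- : English {english}- ({count} pairs)")
```

In this sample, Latin p- lines up with English f-, and Latin t- with English th-, in every pair. Recurring patterns of this kind, rather than isolated look-alikes, are what distinguish inherited vocabulary from chance resemblance.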

In the spirit of the family tree theory, Old Indic was once thought to be the common origin of Indo-European. In 1808 Friedrich Schlegel described India as the original home of the Indo-European peoples and languages. Since this coincided with the biblical tradition of an original home of humankind in Asia, the idea was quickly taken up.

The Florentine scholar and merchant Filippo Sassetti, who traveled to India via Constantinople and Tehran, took an interest in Sanskrit alongside his commercial activities. Around 1585 he noticed striking similarities between words in the Indo-Aryan languages and in Italian.

As early as 1647, the Dutch linguist and scholar Marcus Zuerius van Boxhorn was the first to posit a fundamental relationship between a number of European and Asian languages. Originally he included the Germanic as well as the "Illyrian-Greek" and Italic languages on the one hand and Persian on the other; later he added the Slavic, Celtic and Baltic languages. Van Boxhorn called the common proto-language from which all these languages were supposed to derive Scythian. His thesis, however, did not gain acceptance in the 17th century.

In 1786 the English orientalist William Jones recognized from the similarities of Sanskrit with Greek and Latin that there must be a common root for these languages. He already indicated that this could also apply to Celtic and Persian.

In 1816, the German Franz Bopp provided, in his book On the Conjugation System of the Sanskrit Language in Comparison with That of the Greek, Latin, Persian and Germanic Languages, the methodological proof of the relationship between these languages, and thus founded German Indo-European studies. The Indo-European proto-language could then be recovered through reconstruction (see: Comparative Linguistics).

Various theories offer explanatory models for the spread of the Indo-European languages in space and time, and thus also for their differentiation and further development and for the question of a uniform Indo-European proto-language. Well-known theories include:

The German linguist August Schleicher tried to present the development and relationships of the Indo-European languages in his famous family tree theory. The tree contains both secure and speculative branches; the latter particularly concern extinct languages that have left no descendants. Schleicher tried to reconstruct the hypothetical Proto-Indo-European from the earliest forms of various Indo-European languages. One result was his rendering of the so-called Indo-European fable The Sheep and the Horses as Avis akvāsas ka.

Hermann Hirt founded the substrate theory and used the image of overlapping language layers. The linguistic base is the substrate, over which a superstrate is laid as a linguistic overlay, or the layers lie alongside one another as an adstrate, as in bilingualism. Immigrating Indo-European groups could thus have passed their language on to the local populations.

The wave theory came from Hugo Schuchardt and Johannes Schmidt. It replaces the idea of a family tree growing out of an Indo-European proto-language with the model of a wave whose concentric circles weaken with increasing distance from the center. According to this model, the various Indo-European language groups and individual languages emerged from a relatively unified proto-language and subsequently, via numerous transitional dialects, gave rise to linguistic innovations that spread in a wave-like manner.

Word roots as well as morphological, phonological and even (with restrictions) syntactic features of Indo-European can be reconstructed. This reconstruction does not, however, yield a base language precise enough for actual communication.

Original home

Based on word stems that are common to all Indo-European languages, ethnolinguistics tries, in cooperation with archaeology, to determine the area of origin of the Indo-Europeans and to associate them with prehistoric peoples or cultures. In asking about an original home, however, a distinction must always be made between a hypothetical, linguistic-historical reconstruction of the local conditions under which the earliest recoverable Indo-European root words were formed and, by contrast, an identification of people, language and territory (continuity theory).

Assumed Indo-Aryan migration with corresponding chronological assignment, beginning 4500 BC. In the center: the Yamnaya culture (Yamna culture).

Some hypotheses are significantly shaped by nationalism or have been co-opted by an ideology (e.g. under National Socialism). This also applies, for example, to many Indian scholars who locate the Indo-European original home in India and thereby attribute, for example, the Harappan culture to the Indo-Aryans or to their Proto-Indo-Iranian-speaking, or even Proto-Indo-European-speaking, predecessors. Other extreme assumptions place the original home, for example, in Southeast Europe or east of the Urals as far as the Altai Mountains.

Map of the Indo-European migration from approx. 4000 to 1000 BC (Kurgan hypothesis). Immigration to Anatolia could have taken place either via the Caucasus (not shown) or via the Balkans.
  • Original home according to the Kurgan hypothesis
  • Indo-European-speaking peoples up to 2500 BC
  • Settlement around 1000 BC

    The Kurgan Hypothesis

    The Kurgan hypothesis is one of the hypotheses that assume an original home between these extremes. They focus above all on areas around the Black Sea: the steppes with the Kurgan culture in the north, Transcaucasia in the east, or Asia Minor (Anatolia) in the south.

    Linguists (e.g. J. P. Mallory (1989), A. Parpola (2008), R. S. P. Beekes (2011)) predominantly favor the steppe hypothesis, which is also supported by archaeological findings. The Transcaucasia hypothesis goes back to the linguists T. Gamkrelidze and V. Ivanov (1984).

    The Anatolia Hypothesis

    Distribution of Y-DNA haplogroup J2

    The Anatolia Hypothesis postulates the transfer of culture, especially of language, agriculture and animal husbandry, to Europe through immigration from Anatolia. In a narrower sense, it refers to the spread of an Indo-European proto-language from Anatolia to Europe through and with the Neolithic Revolution. The British archaeologist C. Renfrew (1987) is considered the originator of the Anatolia Hypothesis.

    Renfrew (2003) assumes a gradual immigration of the Indo-European languages, also called the "Indo-Hittite model". The modified hypothesis primarily integrates the latest findings on the genetics of European populations (spread of haplogroups):

    1. From 6500 BC, the Neolithic expansion took place from Anatolia via the Balkan Peninsula (Starčevo culture, Körös-Criș culture) to the Central European Linear Pottery culture.
    2. Around 5000 BC, with the spread of Copper Age cultures, the Indo-European languages in the Balkans divided into three, with a north-western European branch (Danube region) and an eastern steppe branch (ancestors of the Tocharians) splitting off.
    3. Only after 3000 BC did the language families (Greek, Armenian, Albanian, Indo-Iranian, Balto-Slavic) split off from Proto-Indo-European.

    None of the origin hypotheses has so far found general acceptance. An overview of the discussion is provided, for example, by the Mallory student John Day.

    The Caucasus-Iran Hypothesis

    This hypothesis is closely related to the Anatolia Hypothesis, but builds on linguistic and genetic data from 2018 and 2019. Analyses of the spread of the Indo-European languages point to an original home in the southern Caucasus and northern Iran.

    A genetic study (Wang, Reich et al. 2018) supports this hypothesis. According to its authors, the DNA of the early Indo-Europeans matches that of the inhabitants of the southern Caucasus and northern Iran. In their view, these original Indo-Europeans migrated to Anatolia on the one hand and northwards into the southern steppe regions on the other, where the Yamna culture later developed.

    Genetics and Indo-European Studies

    Y-DNA distribution in Europe (haplogroups with less than 3% may be missing from the diagrams.)

    Population geneticists such as Luigi Luca Cavalli-Sforza try to elucidate the origins and relationships of peoples and languages using molecular genetic methods, in particular by researching the distribution of gene mutations (haplogroups).

    It is now assumed that the spread of the Indo-European language(s) is related to the spread of the Y-DNA haplogroups J2, R1a and R1b. However, this agrees only partially with the archaeological findings (Corded Ware versus Bell Beaker culture).

    Indo-European and other language families

    Similarities between languages can be based on kinship, language contact or typological laws. Individual word similarities can also be due to chance. Such similarities can, but need not necessarily, be relevant to the question of the Indo-European original home.

    There are numerous hypotheses about the external relations of Indo-European. The scholarly literature points to closer relations with the language families listed below, or rather with their proto-languages (see also the section on literature).

    • to Uralic. These relations are likely to be based in particular on contact with eastern Indo-European and lie in the area of vocabulary (e.g. the pronominal system) or morphology.
    • to Kartvelian. The two show similarities in the morphophonological system.
    • to Semitic. Although no special relations would be expected given an Indo-European original home north of the Caucasus, some have been shown that even suggest an original genealogical relationship between the two.

    Some scholars, especially Soviet ones, have tried to find evidence of a so-called Nostratic language family, which is said to include Indo-European as well as the Afroasiatic languages (formerly Hamito-Semitic languages) and the Altaic languages, which are themselves controversial as a genealogical unit. This evidence is currently largely regarded as insufficient.

    Similarly, the American linguist Joseph Greenberg proposed a Eurasiatic macro-family based on lexical and grammatical similarities. It includes in particular the three relatively large Indo-European, Uralic and Altaic language families as well as some small families and isolated languages of Eurasia, but expressly not Afroasiatic. This macro-family thus partially coincides with Nostratic, and proponents of both (Greenberg, Bomhard) have identified further fundamental commonalities. So far, no decision on the validity of the Eurasiatic hypothesis is possible either.

    The branches of Indo-European

    The Indo-European languages include the following groups of languages still spoken today:

    • Albanian (approx. 8 million speakers), possibly a successor language to Illyrian
    • Armenian (approx. 9 million speakers)
    • Baltic languages (2 languages still spoken today, approx. 5 million speakers)
    • Germanic languages (around 15 languages with around 500 million native speakers; almost 800 million including second-language speakers)
    • Greek (over 13 million native speakers)
    • Indo-Iranian languages
      • Indo-Aryan languages (over a hundred languages, around a billion speakers)
      • Iranian languages (around 50 languages, around 150-200 million native speakers, plus another 30-50 million second- or third-language speakers)
      • Nuristani languages (6 languages with a total of around 30,000 speakers)
    • Italic languages, all extinct except for the
      • Romance languages (around 15 languages, around 700 million native speakers; 850 million including second-language speakers)
    • Celtic languages (about 6 languages, over 2.5 million speakers, mostly second-language speakers; all but Welsh endangered)
    • Slavic languages (around 20 languages, around 300 million native speakers; 400 million including second-language speakers), possibly forming the unit "Balto-Slavic" together with the Baltic languages

    Two other important groups are extinct (†):

    • Anatolian languages †
    • Tocharian languages †

    In addition, the following languages have survived only in fragments. Their membership in the Indo-European language family is beyond doubt, but their precise relationship to other languages is controversial:

    • Illyrian † (possibly the precursor of Albanian)
    • Lusitanian † (possibly Celtic or more closely related to Celtic)
    • Ancient Macedonian † (possibly more closely related to Greek)
    • Messapic † (possibly more closely related to Illyrian)
    • Phrygian † (shows common developments with Greek and Armenian)
    • Sicel † (possibly Italic)
    • Thracian † (with the dialects Dacian, Getic, Moesian)
    • Venetic † (possibly belonging to Italic)

    Some languages that have survived only in fragments cannot be identified with certainty as Indo-European:

    • Elymian † (possibly belonging to Italic)
    • North Picene † (possibly Sabellic or Greek)
    • Camunic † (possibly Celtic)
    • Ligurian † (disputed due to insufficient linguistic evidence, whether pre-Indo-European, Indo-European or even Celtic)
    • Tartessian † (possibly Celtic)

    Going back to Peter von Bradke (1890), the Indo-European languages are divided into so-called centum and satem languages according to a single criterion, the development of the palatal /ḱ/ (e.g. in the numeral *ḱm̥tóm 'hundred'). The original assumption that this classification reflected a dialect isogloss of the Indo-European proto-language proved untenable with the discovery of Tocharian at the beginning of the 20th century, although it was still partly maintained for some decades. As a purely descriptive criterion, the classification is still in use today.
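The single-criterion nature of this classification can be made concrete with a minimal sketch. It assumes a small table of textbook citation forms of the word for 'hundred', with the sound that continues the palatal consonant noted by hand. The table, the set SIBILANTS and the function classify are illustrative assumptions, not an established tool, and the one-sound summary simplifies the actual sound histories.

```python
# Reflexes of the PIE numeral 'hundred' in some daughter languages.
# Each entry: language -> (citation form, sound continuing the palatal consonant).
# The second field is a hand-made simplification for illustration only.
HUNDRED = {
    "Latin": ("centum", "k"),          # Latin c was pronounced /k/
    "Greek": ("hekaton", "k"),
    "Tocharian B": ("kante", "k"),
    "Avestan": ("satəm", "s"),
    "Sanskrit": ("śatam", "ś"),
    "Lithuanian": ("šimtas", "š"),
    "Old Church Slavonic": ("sŭto", "s"),
}

# Assumption: a sibilant outcome marks a satem language, anything else centum.
SIBILANTS = {"s", "ś", "š"}


def classify(reflex: str) -> str:
    """Label a language by the outcome of the palatal consonant."""
    return "satem" if reflex in SIBILANTS else "centum"


def group_languages(table):
    """Return a dict mapping 'centum' and 'satem' to sorted language lists."""
    groups = {"centum": [], "satem": []}
    for language, (_form, reflex) in table.items():
        groups[classify(reflex)].append(language)
    return {label: sorted(names) for label, names in groups.items()}


if __name__ == "__main__":
    for label, names in group_languages(HUNDRED).items():
        print(f"{label}: {', '.join(names)}")
```

In this sample, Tocharian B falls into the centum group alongside Latin and Greek, although Tocharian was spoken far to the east. Results of this kind are why the grouping is treated as a descriptive label rather than as an old dialect boundary.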

    Family relationships


    Since Schleicher, repeated attempts have been made to trace the subgroups listed above back to common intermediate languages. Only a few of these groupings have prevailed, above all the combination of the Indo-Aryan and Iranian languages as the Indo-Iranian languages. The Balto-Slavic language group (Balto-Slavic hypothesis) is also widely recognized. Still disputed are, among much else, a closer relationship between the Italic and the Celtic languages, the assignment of Venetic to the Illyrian as well as the Italic languages, a Thraco-Phrygian language community, the descent of Albanian from Illyrian, and the group of Balkan Indo-European (Greek, Armenian, Albanian).

    For this reason, the list above makes no finer assignments; disputed cases are listed as individual groups without any indication of presumed relationships.


    The archaisms of Proto-Indo-European are preserved today in only a few of the modern successor languages. A language can be conservative in some properties yet show major changes in others. Claims that a language is particularly conservative (often made for Lithuanian, for example) must therefore relate to specific properties and cannot be generalized.

    Spread of the Indo-European languages

    The maps geographically illustrate one of the propagation hypotheses described above up to approx. 500 AD.


