Sub-clause

A sub-clause is any simple main clause or subordinate clause that is contained as a component in a larger complex sentence or sentence period. Every subordinate clause is therefore a sub-clause of the larger structure, but so are main clauses joined by coordination (e.g. and, but, for) and parentheses (inserted clauses).

Example

"Any sum may currently be asked," says one of them, "if you only spare people the trouble with the tax authorities." (Quotation from Der Spiegel, No. 26, 2008, p. 55.) This complex sentence consists of three sub-clauses, clearly marked by the commas.

Criterion for sub-clauses

Only an expression that largely meets the minimum requirements for a sentence can be a sub-clause. In German, for example, this typically requires a subject and a predicate, together with the complements that the predicate makes necessary. Restrictions are permitted insofar as ellipses are also considered sentence-like. In practical work, sub-clauses are often defined approximately by a simple operational rule: a sentence has as many sub-clauses (in quantitative linguistics often called clauses, a term taken from English) as it has finite verbs (= verbs in a personal form). Applied to the example sentence of the previous section, this criterion identifies the three sub-clauses by the finite verbs "may", "says" and "spare". The method is only approximate, however, because some infinitive constructions also have the status of subordinate clauses (see Subordinate clause, section Infinitive clauses).
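The operational rule can be sketched in a few lines of Python. This is only an illustration: the function name, the token representation and the hand-annotated finite-verb flags are assumptions made here, not part of the cited literature, and a real analysis would need a part-of-speech tagger to identify finite verbs.

```python
from typing import List, Tuple

# Each token is (word form, is_finite_verb). The finite-verb flags are
# annotated by hand for this illustration (an assumption, not tagger output).
Token = Tuple[str, bool]


def count_sub_clauses(tokens: List[Token]) -> int:
    """Operational estimate: one sub-clause per finite verb."""
    return sum(1 for _word, is_finite in tokens if is_finite)


# The example sentence from the previous section, tokenised by hand.
FINITE_VERBS = {"may", "says", "spare"}
words = (
    "Any sum may currently be asked , says one of them , "
    "if you only spare people the trouble with the tax authorities ."
).split()
example = [(word, word in FINITE_VERBS) for word in words]

print(count_sub_clauses(example))  # prints 3
```

Because the count relies only on finite verbs, infinitive constructions with clause status are missed, which is exactly the limitation noted above.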

Linguistic significance of sub-clauses

Like other linguistic units, sub-clauses contribute to the stylistic characteristics of texts through their type, complexity and frequency. In quantitative linguistics, two aspects are in the foreground: the frequency with which sub-clauses of different lengths occur in texts (distribution of sub-clause lengths), and the ratio of sentence length to sub-clause length, or of sub-clause length to the length of the constituents (components) of the sub-clauses (especially phrases, clauses, words). Instead of sub-clause length, the related concept of clause length is sometimes used.

As an example, data obtained from medical textbooks are shown; here, as in some other text classes, sub-clause lengths follow the positive negative binomial distribution. The data are from Schefe (1975); the fit of the distribution is from Best (2006):

x	n(x)	NP(x)
2	33	33.00
3	218	234.79
4	172	158.68
5	105	95.81
6	58	54.41
7	22	29.72
8	13	15.81
9	10	8.24
10 and more	8	8.54

In the table, x is the number of sub-clauses per sentence, n(x) is the number of sentences of length x observed in the evaluated corpus, and NP(x) is the number of sentences of length x computed by fitting the positive negative binomial distribution to the observed data. The goodness-of-fit test yields P = 0.27, so the positive negative binomial distribution is a good model for the observed data: the result of such a test is rated as good if P ≥ 0.05. For more detailed explanations, please refer to the literature given.
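The agreement between observed and fitted frequencies can be checked with a short calculation. The text does not name the test that was used; the sketch below assumes a Pearson chi-square statistic, which is a common choice for this kind of fit. Converting the statistic into a P value additionally requires the degrees of freedom, which the text does not state, so that step is left out.

```python
# (x, observed n(x), fitted NP(x)), copied from the table above.
rows = [
    ("2", 33, 33.00),
    ("3", 218, 234.79),
    ("4", 172, 158.68),
    ("5", 105, 95.81),
    ("6", 58, 54.41),
    ("7", 22, 29.72),
    ("8", 13, 15.81),
    ("9", 10, 8.24),
    ("10 and more", 8, 8.54),
]


def pearson_chi_square(observed, expected):
    """Sum of (observed - expected)^2 / expected over all classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


observed = [n for _x, n, _np in rows]
expected = [np_x for _x, _n, np_x in rows]

total_observed = sum(observed)   # number of sentences in the corpus
total_expected = sum(expected)   # should match the observed total
chi2 = pearson_chi_square(observed, expected)

print(total_observed, round(total_expected, 2), round(chi2, 2))
```

Both columns sum to the same number of sentences, which is a quick sanity check that the table has been transcribed correctly.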

Development of sub-clause lengths

Like word length and sentence length, sub-clause length is a quantity that changes over time. In German-language scientific and technical texts between 1770 and 1940, both sub-clause length and sentence length first increased and then decreased again from 1850 onward, as Möslein found. The author counts main and subordinate clauses, but also infinitive and participle constructions, as sub-clauses. These changes in linguistic usage follow Piotrowski's law in its form for reversible language change, as the following table shows.

t	Year	Words per sub-clause (observed)	Words per sub-clause (calculated)
1	1770	9.4	9.70
4	1800	11.3	11.03
9	1850	12.7	12.51
14	1900	11.8	12.18
16	1920	11.4	11.55
18	1940	11.1	10.74
20	1960	11.9	-

(Explanation: t numbers the points in time by decade for the purposes of the calculation. Fitting Piotrowski's law in its form for reversible language change to the data observed up to 1940 yields the calculated values shown. The value for 1960 is not taken into account because, on the available data, it is unclear whether it indicates a trend reversal or is simply an outlier. The fit yields a coefficient of determination of C = 0.92; C is considered good if it is greater than or equal to 0.80. For more detailed explanations, please refer to the literature given.)
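The reported coefficient of determination can be reproduced from the table. The text does not spell out the formula; the sketch below assumes the usual definition, C = 1 - (residual sum of squares) / (total sum of squares), and uses only the six data points up to 1940.

```python
# Words per sub-clause for 1770-1940, copied from the table above
# (the 1960 value is excluded, as in the text).
observed = [9.4, 11.3, 12.7, 11.8, 11.4, 11.1]
calculated = [9.70, 11.03, 12.51, 12.18, 11.55, 10.74]


def coefficient_of_determination(obs, calc):
    """C = 1 - SS_res / SS_tot (assumed standard definition)."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - c) ** 2 for o, c in zip(obs, calc))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1 - ss_res / ss_tot


C = coefficient_of_determination(observed, calculated)
print(round(C, 2))  # prints 0.92
```

Under this assumption the table values give the same C = 0.92 as stated, above the 0.80 threshold mentioned in the explanation.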

Literature

Helmut Glück (Ed.): Metzler Lexicon Language. 4th, updated and revised edition. J. B. Metzler, Stuttgart et al. 2010, ISBN 978-3-476-02335-3.
Wilfried Kürschner: Grammatical Compendium. Systematic index of basic grammatical terms. 3rd, enlarged and revised edition. Francke, Tübingen et al. 1997, ISBN 3-8252-1526-1, pp. 216–220 (UTB für Wissenschaft. Uni-Taschenbuch. Linguistik 1526).


References

Cf. Brigitta Niehaus: Investigation of the sentence length frequency in German. In: Karl-Heinz Best (Ed.): Glottometrika 16. Wissenschaftlicher Verlag Trier, Trier 1997, ISBN 3-88476-276-1, pp. 213–275; on "clause": p. 221.

As in Peter Schefe: Statistical syntactic analysis of technical languages with the help of electronic computers using the example of medical, business and literary scientific language in German. Kümmerle, Göppingen 1975, ISBN 3-87452-293-8. (Extended and revised version of the dissertation.)

Karl-Heinz Best: Distribution of phrase and sub-clause lengths in German technical language. In: Naukovyj Visnyk Černivec'koho Universytetu: Herman'ska filolohija. Vypusk 319-320, 2006, pp. 113–120.

For example: Karl-Heinz Best: Quantitative Linguistics. An approximation. 3rd, heavily revised and expanded edition. Peust & Gutschmidt, Göttingen 2006, ISBN 3-933043-17-4, p. 27ff.

Kurt Möslein: Some development tendencies in the syntax of scientific-technical literature since the end of the 18th century. In: Walther von Hahn (Ed.): Technical languages. Wissenschaftliche Buchgesellschaft, Darmstadt 1981, ISBN 3-534-07141-7, pp. 276–319; on sub-clauses: p. 304. First published in 1974.

Karl-Heinz Best: Sentence lengths in German: distributions, mean values, language change. In: Göttinger Contributions to Linguistics 7, 2002, pp. 7–31; on the development of sub-clause lengths: p. 26f. Table slightly corrected for this presentation.

Gabriel Altmann: The Piotrowski law and its generalizations. In: Karl-Heinz Best, Jörg Kohlhase (Ed.): Exact language change research. Theoretical contributions, statistical analyses and work reports (= Göttinger Schriften zur Sprach- und Literaturwissenschaft. Vol. 2). edition herodot, Göttingen 1983, ISBN 3-88694-024-1, pp. 54–90; on reversible language change: p. 78ff.
