Text Encoding Initiative

The Text Encoding Initiative (TEI) is an organization founded in 1987 (organized as the TEI Consortium since 2000) and also the name of the document format for encoding and exchanging texts that the organization developed and continues to maintain. In its current version, P5, the format is based on XML and is defined in a metalanguage from which formal schemas such as DTD, XML Schema, and RELAX NG can be derived.

TEI has become a de facto standard in the humanities, where it is used, for example, to encode printed works (scholarly editing) or to annotate linguistic information in texts (linguistics).

History

TEI has been developed since 1988, initially on the basis of SGML. The first draft, P1 (P for proposal), was published in 1990. After an interim version, P2 (1992), which contained enhancements and fixes, the further enhanced version P3, the first stable version, was adopted in 1994. With the development and spread of XML, TEI also had to evolve. For this purpose, the TEI Consortium was founded in 2000. The first XML version, P4, appeared in 2002; at the same time, the TEI Lite version was created with a reduced set of elements. Version P5 had been in development since 2005 and was released on November 1, 2007. It was thoroughly revised technically and expanded in content, including a standard for describing manuscripts (MASTER).

Technology

TEI is made up of various subject-specific modules that contain, for example, elements for document structure, for marking up poems and plays, for marking individual lines and pages, for tables, for text-critical annotations, or for language corpora, terminologies, and dictionaries. A core set of modules contains general-purpose elements such as <p/> for paragraphs. Depending on the project, this core can be extended with whichever additional modules are needed, which allows very fine-grained markup of text features. The TEI schema for a specific application is itself defined as a TEI document in a metalanguage, called an ODD document (One Document Does it all). Formal schemas such as DTD, XML Schema, and RELAX NG can be generated automatically from the ODD document. The TEI website offers tools both for customizing TEI and for generating the schemas.
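
As a rough illustration of how modules are selected in an ODD document, the following sketch combines four base modules with the manuscript description module. The schema identifier myProject and the header texts are placeholders assumed for this example and do not come from a real project; an actual customization would typically also constrain individual elements and attributes.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<TEI xmlns="http://www.tei-c.org/ns/1.0">
    <teiHeader>
        <fileDesc>
            <titleStmt>
                <!-- placeholder title, assumed for this sketch -->
                <title>Hypothetical project customization</title>
            </titleStmt>
            <publicationStmt>
                <p>Unpublished sketch</p>
            </publicationStmt>
            <sourceDesc>
                <p>Born digital, no source</p>
            </sourceDesc>
        </fileDesc>
    </teiHeader>
    <text>
        <body>
            <!-- ident is a hypothetical schema name; start names the root element -->
            <schemaSpec ident="myProject" start="TEI">
                <!-- base modules: infrastructure, header, core elements, text structure -->
                <moduleRef key="tei"/>
                <moduleRef key="header"/>
                <moduleRef key="core"/>
                <moduleRef key="textstructure"/>
                <!-- project-specific addition: manuscript description -->
                <moduleRef key="msdescription"/>
            </schemaSpec>
        </body>
    </text>
</TEI>
```

Because the customization is itself a TEI document, it can carry its own header and prose documentation alongside the schema specification.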

Examples

Hello World!

<?xml version="1.0" encoding="UTF-8"?>
<TEI xmlns="http://www.tei-c.org/ns/1.0">
    <teiHeader>
        <fileDesc>
            <titleStmt>
                <title>Hello World!</title>
            </titleStmt>
            <publicationStmt>
                <p>Demo for Wikipedia</p>
            </publicationStmt>
            <sourceDesc>
                <p>Original work, no source</p>
            </sourceDesc>
        </fileDesc>
    </teiHeader>
    <text>
        <body>
            <p>Hello World!</p>
        </body>
    </text>
</TEI>

Practical example

The following example encodes a poem with detailed bibliographic information as well as line and page numbers (TEI Lite).

<?xml version="1.0" encoding="UTF-8"?>
<TEI xmlns="http://www.tei-c.org/ns/1.0">
    <teiHeader>
        <fileDesc>
            <titleStmt>
                <title>Auf dem Brocken</title>
                <author>Heinrich Heine (1797–1856)</author>
                <respStmt>
                    <name>Wiki Author</name>
                    <resp>Conversion to TEI-conformant XML</resp>
                </respStmt>
            </titleStmt>
            <publicationStmt>
                <p>from Wikisource, the free collection of source texts
                    (<ptr target="http://de.wikisource.org/wiki/Auf_dem_Brocken"/>)</p>
            </publicationStmt>
            <sourceDesc>
                <biblFull>
                    <titleStmt>
                        <title level="a">Auf dem Brocken</title>
                        <title level="m">Buch der Lieder</title>
                        <title level="m" type="sub">Aus der Harzreise</title>
                        <author>Heine, Heinrich</author>
                    </titleStmt>
                    <publicationStmt>
                        <publisher>Hoffmann und Campe</publisher>
                        <pubPlace>Hamburg</pubPlace>
                        <date>1827</date>
                        <availability>
                            <p>Public domain, no usage restrictions</p>
                        </availability>
                    </publicationStmt>
                </biblFull>
            </sourceDesc>
        </fileDesc>
    </teiHeader>
    <text>
        <body>
            <pb n="302"/>
            <head>Auf dem Brocken.</head>
            <lg type="stanza">
                <l>Heller wird es schon im Osten</l>
                <l>Durch der Sonne kleines Glimmen,</l>
                <l>Weit und breit die Bergesgipfel,</l>
                <l>In dem Nebelmeere schwimmen.</l>
            </lg>
            <lg type="stanza">
                <l n="5">Hätt’ ich Siebenmeilenstiefel,</l>
                <l>Lief ich, mit der Hast des Windes,</l>
                <l>Ueber jene Bergesgipfel,</l>
                <l>Nach dem Haus des lieben Kindes.</l>
            </lg>
            <lg type="stanza">
                <l>Von dem Bettchen, wo sie schlummert,</l>
                <l n="10">Zög’ ich leise die Gardinen,</l>
                <l>Leise küßt’ ich ihre Stirne,</l>
                <l>Leise ihres Munds Rubinen.</l>
            </lg>
            <lg type="stanza">
                <l>Und noch leiser wollt’ ich flüstern</l>
                <l>In die kleinen Lilien-Ohren:</l>
                <l n="15">Denk’ im Traum, daß wir uns lieben,</l>
                <l>Und daß wir uns nie verloren.</l>
            </lg>
        </body>
    </text>
</TEI>
