Text Engineering Software Laboratory

Tesla

Basic data

Developer: University of Cologne
Operating system: platform independent
Programming language: Java
Category: Natural language processing
License: Eclipse Public License
Website: tesla.spinfo.uni-koeln.de

Tesla (Text Engineering Software Laboratory) is software for carrying out reproducible experiments on textual data. In this context, textual data means any data that can be represented as a sequence of discrete units.

Tesla has been under development since 2005 at the Institute for Linguistics (Linguistic Information Processing Department) of the University of Cologne. It provides a software environment for researchers who work with texts.

The framework's conceptual focus is the experimental analysis of data and procedures. It supports scientists in two ways:

  • applying established as well as newly developed procedures to their texts, and
  • documenting experiments in a form that allows them to be reproduced and repeated.

Tesla is a component system written in Java and built on a client-server architecture. Users manage texts and design experiments in the Eclipse-based client. An experiment consists of the source material to be analyzed (individual texts or text collections) and components that each perform a specific text-processing task (e.g. tokenization, part-of-speech tagging or sequence alignment). Components can be combined with one another if their interfaces are compatible. A component's interface is defined by the results it produces, which are attached to the raw data (the texts) as annotations. In contrast to comparable systems such as UIMA, the input and output interfaces of Tesla components are hardly restricted. This allows fine-grained component encapsulation and makes it possible, for example, to use complex data types (such as graphs or high-dimensional vectors) as annotations.
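The idea of components that exchange results as annotations on raw text can be illustrated with a small Java sketch. It is not Tesla's actual API: the names `Annotation`, `Component`, `WhitespaceTokenizer`, `TokenShapeVectorizer` and `run` are hypothetical and chosen only for illustration. The sketch assumes that each component declares which annotation types it requires and which type it produces, and that a chain is valid only if every requirement is met by an earlier component.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch only; none of these types belong to Tesla itself.
final class AnnotationPipelineSketch {

    /** A stand-off annotation: a typed value attached to a span of the raw text. */
    static final class Annotation {
        final String type;
        final int start;    // inclusive character offset into the raw text
        final int end;      // exclusive character offset into the raw text
        final Object value; // arbitrary payload, e.g. a string or a vector

        Annotation(String type, int start, int end, Object value) {
            this.type = type;
            this.start = start;
            this.end = end;
            this.value = value;
        }
    }

    /** Hypothetical component contract: declares what it consumes and produces. */
    interface Component {
        Set<String> requires();
        String produces();
        List<Annotation> process(String text, List<Annotation> existing);
    }

    /** Produces one "token" annotation per whitespace-delimited word. */
    static final class WhitespaceTokenizer implements Component {
        public Set<String> requires() { return Collections.emptySet(); }
        public String produces() { return "token"; }
        public List<Annotation> process(String text, List<Annotation> existing) {
            List<Annotation> out = new ArrayList<>();
            int i = 0;
            while (i < text.length()) {
                while (i < text.length() && Character.isWhitespace(text.charAt(i))) i++;
                int start = i;
                while (i < text.length() && !Character.isWhitespace(text.charAt(i))) i++;
                if (i > start) out.add(new Annotation("token", start, i, text.substring(start, i)));
            }
            return out;
        }
    }

    /** Consumes tokens and attaches a small numeric vector (length, uppercase count) to each. */
    static final class TokenShapeVectorizer implements Component {
        public Set<String> requires() { return Collections.singleton("token"); }
        public String produces() { return "shape-vector"; }
        public List<Annotation> process(String text, List<Annotation> existing) {
            List<Annotation> out = new ArrayList<>();
            for (Annotation a : existing) {
                if (!a.type.equals("token")) continue;
                String token = (String) a.value;
                long upper = token.chars().filter(Character::isUpperCase).count();
                out.add(new Annotation("shape-vector", a.start, a.end,
                        new double[] {token.length(), upper}));
            }
            return out;
        }
    }

    /** Runs the chain in order and rejects it if a component's requirements are unmet. */
    static List<Annotation> run(String text, List<Component> chain) {
        List<Annotation> annotations = new ArrayList<>();
        Set<String> available = new HashSet<>();
        for (Component c : chain) {
            if (!available.containsAll(c.requires())) {
                throw new IllegalStateException("Missing input for component producing " + c.produces());
            }
            List<Annotation> produced = c.process(text, Collections.unmodifiableList(annotations));
            annotations.addAll(produced);
            available.add(c.produces());
        }
        return annotations;
    }

    public static void main(String[] args) {
        List<Annotation> result = run("Tesla annotates raw text",
                Arrays.asList(new WhitespaceTokenizer(), new TokenShapeVectorizer()));
        for (Annotation a : result) {
            System.out.println(a.type + " [" + a.start + "," + a.end + ")");
        }
    }
}
```

Because the payload is an unconstrained `Object`, the second component can attach a vector where the first attached a string. This mirrors, in a very simplified way, the point that loosely restricted interfaces allow complex data types as annotations. A real system would need typed access to payloads, which this sketch leaves out.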

Screenshot of the Tesla client with the graphical experiment editor open

Literature

  • Jürgen Hermes, Stephan Schwiebert: "Classification of text processing components: The Tesla Role System." In: Fink, Lausen, Seidel and Ultsch (eds.): "Advances in Data Analysis, Data Handling and Business Intelligence", Springer, 2010.
  • Jürgen Hermes: "Text processing: design and application." Dissertation, University of Cologne.
  • Stephan Schwiebert: "Tesla. A virtual laboratory for experimental computer and corpus linguistics." Dissertation, University of Cologne.
