Virtual Language Observatory

The Virtual Language Observatory (VLO) is a metasearch engine for scientific language data. It is developed and operated as part of the European research infrastructure project CLARIN. As of January 2018 it held over 1.6 million entries. The data come from the CLARIN data centers in various countries and from other sources accessible online, provided they are licensed for free use and supply their metadata in certain formats. The aim is to make data and sources that have already been collected available for further research and to make them compatible with each other.

Search options

The search interface offers both a free-text search over the existing metadata and a faceted search. Using the facets, resources can be selected or filtered according to the following criteria: language, collection (such as Europeana Newspapers), resource type (such as chanson, chronicle, comedy), modality (such as spoken language, sign language, gestures, writing), data format (such as text, audio, XML), keyword, and usage restrictions.
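How free-text and faceted search combine can be illustrated with a minimal sketch. It is not the VLO's actual implementation: the records, field names, and the functions `search` and `facet_counts` are assumptions made up for this example.

```python
from collections import Counter

# Hypothetical metadata records; field names and values are illustrative only.
RECORDS = [
    {"name": "Newspaper scans", "description": "Digitised historical newspapers",
     "language": "German", "modality": "writing", "format": "text"},
    {"name": "Interview corpus", "description": "Recorded interviews with transcripts",
     "language": "Dutch", "modality": "spoken language", "format": "audio"},
    {"name": "Sign language lexicon", "description": "Video lexicon of signs",
     "language": "German", "modality": "sign language", "format": "video"},
]


def search(records, text="", **selected):
    """Return records that contain `text` anywhere in their metadata
    and that match every selected facet value exactly."""
    needle = text.lower()
    hits = []
    for rec in records:
        # Free-text part: case-insensitive substring match over all fields.
        haystack = " ".join(str(value) for value in rec.values()).lower()
        if needle and needle not in haystack:
            continue
        # Faceted part: every chosen facet must have the chosen value.
        if all(rec.get(facet) == value for facet, value in selected.items()):
            hits.append(rec)
    return hits


def facet_counts(records, facet):
    """Count how many records carry each value of one facet."""
    return Counter(rec[facet] for rec in records if facet in rec)


german = search(RECORDS, language="German")
print([rec["name"] for rec in german])
print(facet_counts(german, "modality"))
```

Selecting a facet value narrows the result set, and the counts for the remaining facets are recomputed over that narrowed set, which is what lets a user refine a search step by step.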

Availability

The resources recorded in the VLO carry licenses that regulate the conditions for subsequent use. Some resources are freely accessible; others are released for academic use, for which users can log in with a federated identity provided by their institution. Some data can only be used after obtaining personal permission.

Technical background

The Virtual Language Observatory uses the Component Metadata Infrastructure (CMDI) developed in CLARIN. Existing metadata must be adapted so that resources can be identified in the VLO.
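The adaptation step can be pictured as mapping elements of a metadata record onto search facets. The following is only a sketch under stated assumptions: the XML sample is a simplified, made-up record and not valid CMDI, and the names `FACET_PATHS` and `extract_facets` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Simplified, made-up record; real CMDI records use namespaces and profiles.
SAMPLE = """<record>
  <Components>
    <Corpus>
      <Title>Interview corpus</Title>
      <Language><Name>Dutch</Name></Language>
      <MediaType>audio</MediaType>
    </Corpus>
  </Components>
</record>"""

# Hypothetical mapping: for each facet, candidate element paths in priority order.
FACET_PATHS = {
    "name": [".//Title"],
    "language": [".//Language/Name"],
    "format": [".//MediaType", ".//Format"],
}


def extract_facets(xml_text, facet_paths):
    """Return a facet -> value dict, taking the first non-empty match per facet."""
    root = ET.fromstring(xml_text)
    facets = {}
    for facet, paths in facet_paths.items():
        for path in paths:
            element = root.find(path)
            if element is not None and element.text and element.text.strip():
                facets[facet] = element.text.strip()
                break  # first matching path wins
    return facets


print(extract_facets(SAMPLE, FACET_PATHS))
```

Listing several candidate paths per facet reflects the idea that differently structured records can still feed the same facet; a record that matches none of the paths simply lacks that facet.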

Literature

  • Haaf, S.; Fankhauser, P.; Trippel, T.; Eckart, K.; Eckart, T.; Hedeland, H.; Herold, A.; Knappen, J.; Schiel, F.; Stegmann, J.; Van Uytvanck, D. (2014): CLARIN's Virtual Language Observatory (VLO) under scrutiny - The VLO taskforce of the CLARIN-D centers. In: CLARIN Annual Conference, Soesterberg, Netherlands. PDF [3].
  • Van Uytvanck, D., Stehouwer, H., & Lampen, L. (2012). Semantic metadata mapping in practice: The Virtual Language Observatory. In N. Calzolari (Ed.), Proceedings of LREC 2012: 8th International Conference on Language Resources and Evaluation (pp. 1029-1034). European Language Resources Association (ELRA). [4]

Individual evidence

  1. See explanations on availability on the VLO help pages (in English)
  2. Broeder, D. et al. (2012): CMDI: a Component Metadata Infrastructure. In: V. Arranz et al. (Eds.): Proceedings of the LREC 2012 Workshop Describing Language Resources with Metadata: Towards Flexibility and Interoperability in the Documentation of Language Resources, Istanbul, Turkey, pp. 1-4. PDF: [1]. See also Introduction to Component Metadata on the European CLARIN website (in English) [2].