Apache Lucene
Basic data
Developer | Apache Software Foundation |
Current version | 8.5.2 (May 26, 2020) |
Operating system | Platform-independent |
Programming language | Java |
Category | Program library |
License | Apache License |
Available in German | No |
Website | lucene.apache.org |
Apache Lucene is a software library for full-text search. Lucene is free software and a project of the Apache Software Foundation.
Lucene is used by Wikipedia, among others (now only indirectly: since 2014 via Elasticsearch). Twitter in particular is an example of Lucene's performance and scalability.
History
Lucene was developed by Doug Cutting, starting in 1997, and was initially available through SourceForge. The name Lucene is the middle name of Doug Cutting's wife.
In 2001 Lucene became part of the Jakarta Project, and in 2005 it became a top-level project of the Apache Software Foundation. Subprojects of Apache Lucene are occasionally spun off and continued as separate projects.
Projects based on Lucene
Lucene Core
- The core of the Lucene project, Lucene Core (or Lucene for short, formerly also called Lucene Java), is a software library written in the Java programming language.
- Lucene first builds an index of the files, which takes up roughly a quarter of the size of the indexed files. It then answers queries with a ranked list of results, for which several search algorithms are available.
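The two steps described above, building an index and then returning ranked results, can be illustrated with a toy inverted index. This is only a minimal sketch in Python, not Lucene's implementation or API: the function names `tokenize`, `build_index` and `search`, the sample documents, and the scoring by raw term counts are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

PUNCTUATION = ".,;:!?()\"'"

def tokenize(text):
    # Lowercase, split on whitespace, and strip surrounding punctuation.
    tokens = (word.strip(PUNCTUATION).lower() for word in text.split())
    return [token for token in tokens if token]

def build_index(docs):
    # Inverted index: term -> {doc_id: number of occurrences in that document}.
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, count in Counter(tokenize(text)).items():
            index[term][doc_id] = count
    return index

def search(index, query):
    # Score each document by the summed occurrence counts of the query terms
    # (a deliberately simple stand-in for a real scoring function).
    scores = Counter()
    for term in tokenize(query):
        for doc_id, count in index.get(term, {}).items():
            scores[doc_id] += count
    # Highest score first; ties are broken by document id.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

# Hypothetical sample documents.
docs = {
    "a": "Lucene is a search library.",
    "b": "Solr is a search server built on Lucene; Solr uses Lucene.",
    "c": "Tika extracts text.",
}
index = build_index(docs)
results = search(index, "lucene search")
```

Because the index maps each term directly to the documents that contain it, a query only touches the entries for its own terms instead of scanning every document.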
Lucene.Net
- Lucene.Net is a port of Lucene to the programming language C#, with the programming interface adapted to the .NET platform.
Lucy
- Lucy is a port of Lucene to the C programming language, with bindings for dynamic programming languages such as Perl.
PyLucene
- PyLucene is a Python extension that wraps Lucene and embeds a Java runtime environment to run it.
Droids
Solr
- Solr is a standalone search server based on Lucene. It was originally developed at CNET under the name Solar, an abbreviation for Search on Lucene and Resin. The Solr download includes an example configuration that uses Jetty. Solr provides a REST-like API and communicates over the Hypertext Transfer Protocol (HTTP). Documents in formats ranging from XML and JSON to PDF can be submitted for indexing via HTTP POST, and documents can also be created this way. Queries are made using HTTP GET.
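The division between HTTP POST for indexing and HTTP GET for querying can be sketched with Python's standard library. The example only builds the requests and does not send them. It assumes a local Solr instance on port 8983 and a hypothetical core named `demo`; the helper names and the sample document are likewise illustrative, so the paths and parameters should be checked against the Solr version in use.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: a local Solr instance with a hypothetical core called "demo".
SOLR_BASE = "http://localhost:8983/solr/demo"

def build_index_request(documents):
    # Indexing: the documents travel as a JSON array in the body of a POST.
    body = json.dumps(documents).encode("utf-8")
    return Request(
        SOLR_BASE + "/update?commit=true",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def build_query_request(query, rows=10):
    # Querying: the search parameters travel in the URL of a GET.
    params = urlencode({"q": query, "rows": rows, "wt": "json"})
    return Request(SOLR_BASE + "/select?" + params, method="GET")

# Hypothetical sample document and query.
index_req = build_index_request([{"id": "1", "title": "Lucene in Action"}])
query_req = build_query_request("title:lucene")

# To actually send a request against a running server:
#   from urllib.request import urlopen
#   with urlopen(query_req) as response:
#       print(response.read().decode("utf-8"))
```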
Tika
- Tika, formerly part of the Lucene project, is a parser that is also used by Solr. It extracts metadata and structured text from a range of document formats by delegating to specialized (where possible, existing) libraries such as Apache PDFBox or Apache POI. Tika exposes these libraries through a uniform interface and can select the appropriate one automatically.
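The idea of one uniform entry point that picks a specialized parser automatically can be sketched as a small registry. This is not Tika's API: the detectors, the parser functions and the result layout are hypothetical, and the PDF parser is a stub standing in for a real library.

```python
import json

def parse_pdf_stub(data):
    # Stub only: a real facade would delegate to a PDF library here.
    return {"parser": "pdf", "metadata": [], "text": ""}

def parse_json(data):
    # Stand-in parser: reports the top-level keys as "metadata".
    obj = json.loads(data.decode("utf-8"))
    return {"parser": "json", "metadata": sorted(obj), "text": ""}

def parse_text(data):
    # Fallback parser: treats the input as plain UTF-8 text.
    return {"parser": "text", "metadata": [], "text": data.decode("utf-8")}

# Registry of (detector, parser) pairs, checked in order.
# The last entry accepts everything and acts as the fallback.
PARSERS = [
    (lambda data: data.startswith(b"%PDF-"), parse_pdf_stub),
    (lambda data: data.lstrip().startswith(b"{"), parse_json),
    (lambda data: True, parse_text),
]

def parse(data):
    # Uniform entry point: callers never choose a parser themselves.
    for detect, parser in PARSERS:
        if detect(data):
            return parser(data)

pdf_result = parse(b"%PDF-1.7 sample bytes")
json_result = parse(b'{"title": "x", "author": "y"}')
text_result = parse(b"plain text")
```

Callers use `parse` for every input, so supporting another format only requires adding an entry to the registry.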
Nutch
Other Lucene derivatives have been created outside the project.
Functionality
Lucene uses the tf-idf measure and the vector space model to score search hits.
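As an illustration of tf-idf weighting combined with the vector space model, the following sketch turns documents and a query into weighted term vectors and ranks the documents by cosine similarity. It is not Lucene's actual scoring function: the smoothed idf variant, the function names and the sample documents are assumptions chosen for this example.

```python
import math
from collections import Counter

def tfidf_vectors(token_lists):
    # Document frequency: in how many documents each term occurs.
    n_docs = len(token_lists)
    df = Counter(term for tokens in token_lists for term in set(tokens))
    # Smoothed inverse document frequency (one common variant, assumed here).
    idf = {term: math.log((1 + n_docs) / (1 + count)) + 1 for term, count in df.items()}
    # Each document becomes a sparse vector: term -> tf * idf.
    vectors = [
        {term: tf * idf[term] for term, tf in Counter(tokens).items()}
        for tokens in token_lists
    ]
    return vectors, idf

def cosine(u, v):
    # Cosine of the angle between two sparse vectors; 0.0 if either is empty.
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical, already tokenized sample documents.
docs = [
    "lucene full text search library".split(),
    "solr search server".split(),
    "tika document parser".split(),
]
vectors, idf = tfidf_vectors(docs)

# The query is weighted with the same idf values as the documents.
query_tf = Counter("lucene search".split())
query_vec = {term: tf * idf.get(term, 0.0) for term, tf in query_tf.items()}

# Document positions ordered from most to least similar to the query.
ranking = sorted(range(len(docs)), key=lambda i: cosine(query_vec, vectors[i]), reverse=True)
```

Rare terms such as "lucene" get a higher idf than common ones such as "search", so the first document, which contains both query terms, ranks ahead of the second, which contains only the common one.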
Literature
- Manfred Hardt, Fabian Theis: Developing search engines with Apache Lucene. entwickler.press, 2004.
- Erik Hatcher et al.: Lucene in Action. Manning, 2005 (covers Lucene 1.4); 2nd ed. 2010 (covers Lucene 3.0).
- Florian Hopf: Flexible search with Lucene. In: Java aktuell. Issue 4-2013, p. 31 ff.
Web links
- www.lucenetutorial.com - English-language introduction
- www.jaxenter.de - Apache Solr and ElasticSearch