Apache Lucene

Contributor(s): Stan Gibilisco

Apache Lucene is a freely available, open source information retrieval software library that indexes and searches fields of text within documents. This evolving venture is also called the Apache Lucene Project. Lucene is written in Java and is developed under the Apache Software Foundation, which distributes it under the open source Apache License.

The Lucene application programming interface (API) stays the same regardless of the format of the file to be indexed. As long as the text can be extracted from a document, Lucene can index it, whatever the original file type. Lucene has become popular for use in Internet search engines as well as for single-site search operations.
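The idea of indexing fields of extracted text, independent of the original file format, can be sketched with a toy inverted index. This is a minimal illustration in Python, not Lucene's API: the names `tokenize`, `ToyIndex`, `add_document`, and `search`, as well as the sample documents, are hypothetical and exist only for this example.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


class ToyIndex:
    """A toy inverted index keyed by (field, term). Not Lucene's API."""

    def __init__(self):
        self.postings = defaultdict(set)  # (field, term) -> set of doc ids
        self.docs = {}                    # doc id -> original field dict

    def add_document(self, doc_id, fields):
        # `fields` maps a field name to text already extracted from the file.
        self.docs[doc_id] = fields
        for field, text in fields.items():
            for term in tokenize(text):
                self.postings[(field, term)].add(doc_id)

    def search(self, field, query):
        # Return ids of documents whose field contains every query term.
        terms = tokenize(query)
        if not terms:
            return []
        matches = set.intersection(
            *(self.postings.get((field, term), set()) for term in terms)
        )
        return sorted(matches)


index = ToyIndex()
# Text extracted from a PDF and from an HTML page ends up in the same shape.
index.add_document(
    "report.pdf",
    {"title": "Quarterly report", "body": "Search traffic grew this quarter"},
)
index.add_document(
    "faq.html",
    {"title": "Search FAQ", "body": "How to search the site"},
)

print(index.search("body", "search"))
print(index.search("title", "search"))
```

Because the index only ever sees field names and extracted text, the same `add_document` call serves both source files; the format-specific work happens before indexing.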

The Apache Lucene Project comprises four main components:

  • Lucene Core: Indexing, searching, spell checking, hit highlighting, and tokenization.
  • PyLucene: Python extension that gives Python programs access to Lucene Core.
  • Solr: Search server built on Lucene Core, with Extensible Markup Language (XML) over Hypertext Transfer Protocol (HTTP) and JavaScript Object Notation (JSON), Python, and Ruby APIs, as well as hit highlighting, faceted search, caching, replication, and an interface for Web site administrators.
  • Open Relevance Project: Free distribution of materials for performance testing and relevance evaluation.
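As an illustration of the HTTP and JSON interface listed for Solr above, the following sketch builds a query URL that requests a JSON response with hit highlighting and faceting turned on. It assumes Solr's standard `/select` handler and the request parameters `q`, `wt`, `hl`, `hl.fl`, `facet`, and `facet.field`; the host, the core name `articles`, the field names, and the helper `build_solr_query_url` are hypothetical. No request is sent.

```python
from urllib.parse import urlencode


def build_solr_query_url(base_url, core, query, highlight_field=None, facet_field=None):
    """Build a URL for Solr's /select handler that asks for a JSON response."""
    params = [("q", query), ("wt", "json")]
    if highlight_field:
        # Ask Solr to return highlighted snippets from this field.
        params += [("hl", "true"), ("hl.fl", highlight_field)]
    if facet_field:
        # Ask Solr to return value counts for this field.
        params += [("facet", "true"), ("facet.field", facet_field)]
    return f"{base_url}/solr/{core}/select?{urlencode(params)}"


# Hypothetical local server, core, and field names.
url = build_solr_query_url(
    "http://localhost:8983",
    "articles",
    "body:lucene",
    highlight_field="body",
    facet_field="category",
)
print(url)
```

The exact URL path depends on the Solr version and on how its cores are configured, so treat the layout here as an assumption to check against the deployed server.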
This was last updated in May 2013
