What is Apache Lucene?
Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform. It is an open source project available for free download.
Since I applied to Google Summer of Code 2016 (GSoC), I decided to install Apache Lucene 6.0.0 and learn the basic concepts before the final decision on 25 April.
Download Apache Lucene 6.0.0
First, I downloaded the latest Lucene distribution (6.0.0) and extracted
it to my GSoC working directory.
Include jars for demo
Assume we're at the root of the Apache Lucene installation. Four JARs need to be included for the demo:
- the Lucene JAR
- the queryparser JAR
- the common analysis JAR
- the Lucene demo JAR
Use the Linux export command to add these JARs to the Java classpath.
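Here is a minimal sketch of how that export could look. The subdirectories (core, queryparser, analysis/common, demo) and the JAR file names are my assumption about the layout of the 6.0.0 binary distribution, and lucene_demo_classpath is just a hypothetical helper name, so check both against your own extraction before relying on them.

```shell
#!/bin/sh
# Sketch: build the demo classpath from a Lucene install directory.
# ASSUMPTION: the directory layout and JAR names below match the extracted
# 6.0.0 distribution; adjust them if your archive looks different.
lucene_demo_classpath() {
    lucene_home="$1"   # root of the extracted distribution
    version="$2"       # e.g. 6.0.0
    printf '%s:%s:%s:%s\n' \
        "$lucene_home/core/lucene-core-$version.jar" \
        "$lucene_home/queryparser/lucene-queryparser-$version.jar" \
        "$lucene_home/analysis/common/lucene-analyzers-common-$version.jar" \
        "$lucene_home/demo/lucene-demo-$version.jar"
}

# Run from the root of the installation, so $PWD is the Lucene home.
CLASSPATH="$(lucene_demo_classpath "$PWD" 6.0.0)"
export CLASSPATH
```

Building the string in a small function keeps the four paths in one place, which makes it easier to fix them if a path turns out to be wrong.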
With that done, I can now build an index! From the Lucene home directory,
type the following command to build an index for docs. Note that the
official tutorial suggests indexing a different folder, but that folder is
not available in the Apache Lucene 6.0.0 installation (lucene-6.0.0.tgz).
If you're in the same situation, use another folder, such as docs:
java org.apache.lucene.demo.IndexFiles -docs docs
This will produce a subdirectory called index, which will contain an index of
all the files under the docs folder.
We can then search the index from the command line.
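As a rough sketch, a search could be wrapped like this. It assumes the classpath already contains the four demo JARs and that the index lives in the index subdirectory created by the indexing step. The function name search_index and the sample query are hypothetical; org.apache.lucene.demo.SearchFiles and its -index and -query options come from the Lucene demo module.

```shell
#!/bin/sh
# Sketch: query an existing index with the Lucene demo search class.
# ASSUMPTION: CLASSPATH already holds the demo JARs and the index was
# built earlier with IndexFiles.
search_index() {
    index_dir="$1"   # directory produced by IndexFiles, e.g. index
    query="$2"       # query string, e.g. a single keyword
    if [ ! -d "$index_dir" ]; then
        echo "no index at $index_dir; run IndexFiles first" >&2
        return 1
    fi
    java org.apache.lucene.demo.SearchFiles -index "$index_dir" -query "$query"
}

# Example call (sample keyword, not from my run):
# search_index index lucene
```

Checking for the index directory first gives a clearer message than the Java stack trace you would otherwise get when the index is missing.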
Here are the search results for the keyword huangmincong and a second keyword.
Tomorrow, I'll learn more about how Lucene works, especially the
Directory and the