Lucene: Add Indexing and Search to Your Web Apps : Page 2

Get a crash course in using Lucene, an open source Java library that enables you to add indexing and search capabilities to your Web applications and documents.

Creating a Lucene Index
The example file MakeIndex.java in the directory "simple_example" shows the few lines of code required to create a new Lucene index. This demo program reads all text files in the directory "example_text_files" and adds them to the index. The following lines from the example program are the ones that use the Lucene class libraries.

Create a Lucene IndexWriter instance:

IndexWriter indexWriter = new IndexWriter("index", new StandardAnalyzer(), true);

This creates the index in the directory "index". The standard analyzer tokenizes text and discards some common noise words. The third argument is a flag to indicate that Lucene should delete any existing indices in the "index" directory. You would set this flag to false if you wanted to add to an existing index.
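To make "tokenizes text and discards some common noise words" concrete, here is a minimal sketch of that idea using only the Java standard library. It is not Lucene's StandardAnalyzer: the class name AnalyzerSketch, the splitting rule, and the short stop-word list are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Simplified illustration of an analyzer; NOT Lucene's StandardAnalyzer.
final class AnalyzerSketch {
    // Hypothetical noise-word list, chosen for this example only.
    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "an", "and", "in", "is", "of", "the", "to"));

    // Lowercase the text, split on anything that is not a letter or digit,
    // and drop empty pieces and noise words.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!raw.isEmpty() && !STOP_WORDS.contains(raw)) {
                tokens.add(raw);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [game, go, hard, for, ai]
        System.out.println(analyze("The game of Go is hard for AI"));
    }
}
```

The same kind of normalization has to be applied to queries as to indexed text, which is why the search code later in this article also constructs an analyzer.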

Create a new Document instance and add it to the index:

Document document = new Document();
document.add(Field.Text("text", new FileReader(fullPath)));
document.add(Field.UnIndexed("filepath", fullPath));

The second argument to Field.Text can be the string of text to index (in which case it is stored in the index) or a file reader (in which case the text is read and indexed, but not stored in the index).
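The difference between indexed and stored data can be illustrated with a toy in-memory index. This is a sketch under stated assumptions, not how Lucene is implemented: the class TinyIndexSketch and its methods are hypothetical. The text is tokenized into a lookup table and then discarded (indexed, not stored), while the file path is kept verbatim but cannot be searched (stored, not indexed).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy index for illustration only; hypothetical names, not Lucene classes.
final class TinyIndexSketch {
    // term -> ids of the documents whose text contains that term
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    // document id -> stored file path, returned with search results
    private final List<String> storedPaths = new ArrayList<>();

    void add(String text, String filepath) {
        int docId = storedPaths.size();
        storedPaths.add(filepath); // stored, but never added to postings
        for (String term : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
            }
        }
        // The text itself is not kept: it is searchable but not retrievable.
    }

    List<String> search(String term) {
        Set<Integer> ids =
                postings.getOrDefault(term.toLowerCase(Locale.ROOT), Collections.emptySet());
        List<String> paths = new ArrayList<>();
        for (int docId : ids) {
            paths.add(storedPaths.get(docId));
        }
        return paths;
    }

    public static void main(String[] args) {
        TinyIndexSketch index = new TinyIndexSketch();
        index.add("Lisp uses cons cells", "docs/lisp.txt");
        System.out.println(index.search("cons"));          // prints [docs/lisp.txt]
        System.out.println(index.search("docs/lisp.txt")); // prints []
    }
}
```

Storing only the path keeps the index small; the application reopens the original file when it needs the full text.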

Running the MakeIndex Program
When you run the MakeIndex program with the three sample text files that I put in the "example_text_files" directory, you see the following output:

Wrote file ./example_text_files/AI_Go_Consciousness.txt to index.
Wrote file ./example_text_files/Jumpstarting the Semantic.txt to index.
Wrote file ./example_text_files/Loving Lisp.txt to index.

The Lucene index is written to the "index" directory.

Searching an Existing Lucene Index
Create a new searcher instance and a standard text analyzer:

Searcher searcher = new IndexSearcher("index");
Analyzer analyzer = new StandardAnalyzer();

Note that you specify that the existing index is stored in the directory "index". The MakeIndex program created this index.

The following code runs a query on a line of text that the user enters and prints the search results (it assumes that in is a reader that supports readLine(), such as a BufferedReader):

  String line = in.readLine();
  Query query = QueryParser.parse(line, "text", analyzer);
  System.out.println("Searching for: " + query.toString("text"));
  Hits hits = searcher.search(query);
  System.out.println("Number of matching documents = " + hits.length());
  for (int i = 0; i < hits.length(); i++) {
    Document doc = hits.doc(i);
    System.out.println("File: " + doc.get("filepath") + ", score: " + hits.score(i));
  }

When you create a search query, you specify that the search should be performed on any text data in the document field "text". There is nothing special about the name "text"; it is just the field name that you specified in the MakeIndex program.

Running the SearchText Program
The following text shows a sample session with the example program. The user's input is the text that follows each "Search query" prompt:

Search query (enter a blank query to stop) : AI Go
Searching for: ai go
Number of matching documents = 1
File: ./example_text_files/AI_Go_Consciousness.txt, score: 0.3521486
Search query (enter a blank query to stop) : Lisp cons
Searching for: lisp cons
Number of matching documents = 1
File: ./example_text_files/Loving Lisp.txt, score: 0.18363969
Search query (enter a blank query to stop) : 
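The session above implies a loop that keeps reading queries until the user enters a blank line. A minimal sketch of that control flow follows, using only the Java standard library. The class QueryLoopSketch and the method readQueries are hypothetical names, not part of the example program or of Lucene. The search step is left out, so the sketch simply collects the queries it reads.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Sketch of the "enter a blank query to stop" loop; hypothetical names.
final class QueryLoopSketch {
    // Reads one query per line and stops at the first blank line or end of input.
    static List<String> readQueries(Reader input) {
        BufferedReader in = new BufferedReader(input);
        List<String> queries = new ArrayList<>();
        try {
            String line;
            while ((line = in.readLine()) != null && !line.trim().isEmpty()) {
                queries.add(line.trim()); // a real program would run the search here
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return queries;
    }

    public static void main(String[] args) {
        // Simulated user input; the blank line ends the session.
        Reader simulated = new StringReader("AI Go\nLisp cons\n\n");
        System.out.println(readQueries(simulated)); // prints [AI Go, Lisp cons]
    }
}
```

In an interactive program, the Reader would typically be an InputStreamReader wrapped around System.in.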
