How does the Lucene 4.3.1 highlighter work? I want to print each search result as the matched word plus the 8 words that follow it in the document. How can I use the Highlighter class to do that? I have collected full txt, html and xml documents in one place and added them to my index. I already have a search routine, shown below, to which I plan to add the highlighting:
    String index = "index";
    String field = "contents";
    String queries = null;
    int repeat = 1;
    boolean raw = true; //not sure what raw really does???
    String queryString = null; //keep null, prompt user later for it
    int hitsPerPage = 10; //leave it at 10, go from there later
    //need to add all files to same directory
    index = "C:\\Users\\plib\\Documents\\index";
    repeat = 4;
    IndexReader reader = DirectoryReader.open(FSDirectory.open(new File(index)));
    IndexSearcher searcher = new IndexSearcher(reader);
    Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_43);
    BufferedReader in = null;
    if (queries != null) {
        in = new BufferedReader(new InputStreamReader(new FileInputStream(queries), "UTF-8"));
    } else {
        in = new BufferedReader(new InputStreamReader(System.in, "UTF-8"));
    }
    QueryParser parser = new QueryParser(Version.LUCENE_43, field, analyzer);
    while (true) {
        if (queries == null && queryString == null) { // prompt the user
            System.out.println("Enter query. 'quit' = quit: ");
        }
        String line = queryString != null ? queryString : in.readLine();
        if (line == null) { // end of input
            break;
        }
        line = line.trim();
        if (line.length() == 0 || line.equalsIgnoreCase("quit")) {
            break;
        }
        Query query = parser.parse(line);
        System.out.println("Searching for: " + query.toString(field));
        if (repeat > 0) { // repeat & time as benchmark
            Date start = new Date();
            for (int i = 0; i < repeat; i++) {
                searcher.search(query, null, 100);
            }
            Date end = new Date();
            System.out.println("Time: " + (end.getTime() - start.getTime()) + "ms");
        }
        doPagingSearch(in, searcher, query, hitsPerPage, raw, queries == null && queryString == null);
        if (queryString != null) {
            break;
        }
    }
    reader.close();
}
For the Lucene highlighter to work, you need to add two fields to each document you index: one with term vectors enabled and one without. For simplicity, here is a code snippet:
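A minimal sketch of that two-field setup, assuming the Lucene 4.x `FieldType` API. The class name `HighlightFields` and the field names `content` and `ncontent` are placeholders chosen for illustration, and this has not been run against 4.3.1 specifically:

```java
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.FieldType;
import org.apache.lucene.document.TextField;

public class HighlightFields {
    // Hypothetical field names, chosen for illustration only.
    static final String VECTOR_FIELD = "content";
    static final String PLAIN_FIELD = "ncontent";

    /** Builds a document that carries the same text in both fields. */
    static Document buildDocument(String text) {
        // Copy the stored, tokenized text type, then switch on term vectors
        // together with positions and offsets.
        FieldType withVectors = new FieldType(TextField.TYPE_STORED);
        withVectors.setStoreTermVectors(true);
        withVectors.setStoreTermVectorPositions(true);
        withVectors.setStoreTermVectorOffsets(true);
        withVectors.freeze();

        Document doc = new Document();
        // Field with term vectors enabled.
        doc.add(new Field(VECTOR_FIELD, text, withVectors));
        // Field without term vectors; stored so the raw text can be read back.
        doc.add(new TextField(PLAIN_FIELD, text, Field.Store.YES));
        return doc;
    }
}
```

Both fields are stored here because the highlighter needs the original text of a hit in order to cut fragments from it.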
After setting up both fields, add the document to your index. Then use the method below to run the Lucene highlighter (it uses Lucene 4.2; I have not tested it with Lucene 4.3.1):
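A hedged sketch of such a method, assuming the Lucene 4.x highlighter module (`lucene-highlighter`) is on the classpath and that the highlighted field is stored. The class name `HighlightSketch`, the method name `highlightHits`, the fragment size of 60 and the limit of 3 fragments per hit are illustrative choices, not values from a tested setup:

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.document.Document;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.search.highlight.Highlighter;
import org.apache.lucene.search.highlight.InvalidTokenOffsetsException;
import org.apache.lucene.search.highlight.QueryScorer;
import org.apache.lucene.search.highlight.SimpleHTMLFormatter;
import org.apache.lucene.search.highlight.SimpleSpanFragmenter;
import org.apache.lucene.search.highlight.TokenSources;

public class HighlightSketch {
    /** Returns highlighted fragments for the top hits of the query. */
    static List<String> highlightHits(IndexSearcher searcher, Analyzer analyzer,
                                      Query query, String field, int maxHits)
            throws IOException, InvalidTokenOffsetsException {
        // Scores fragments by the query terms they contain in this field.
        QueryScorer scorer = new QueryScorer(query, field);
        // The default formatter wraps each matched term in <B>...</B>.
        Highlighter highlighter = new Highlighter(new SimpleHTMLFormatter(), scorer);
        // Fragment size is a rough text length, not a word count.
        highlighter.setTextFragmenter(new SimpleSpanFragmenter(scorer, 60));

        List<String> fragments = new ArrayList<String>();
        TopDocs hits = searcher.search(query, maxHits);
        for (ScoreDoc hit : hits.scoreDocs) {
            Document doc = searcher.doc(hit.doc);
            String text = doc.get(field);
            if (text == null) {
                continue; // field was not stored, nothing to cut fragments from
            }
            // Uses the term vector when one exists, otherwise re-analyzes the text.
            TokenStream tokens = TokenSources.getAnyTokenStream(
                    searcher.getIndexReader(), hit.doc, field, analyzer);
            for (String fragment : highlighter.getBestFragments(tokens, text, 3)) {
                fragments.add(fragment);
            }
        }
        return fragments;
    }
}
```

In the search loop from the question, this could be called right after `parser.parse(line)`, for example as `highlightHits(searcher, analyzer, query, field, hitsPerPage)`, printing each returned string. Because the fragmenter works on text length rather than on words, a fragment only approximates "the matched word plus eight words"; trimming to an exact word count would need an extra step on the returned fragment.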