Highlight only the matched query term in Solr


I have Solr 5.1.3 (Jetty) installed and have indexed more than 15,000 documents using Tika. Each document's published date and content are indexed and stored in Solr. I have enabled highlighting in solrconfig.xml. Here is the XML of the request handlers with the highlighting parameters:

<requestHandler name="/select" class="solr.SearchHandler">
    <!-- default values for query parameters can be specified, these
         will be overridden by parameters in the request
      -->
     <lst name="defaults">
       <str name="echoParams">explicit</str>
       <int name="rows">10</int>
       <str name="hl">on</str>
       <str name="hl.fl">content</str>
       <str name="hl.simple.pre">&lt;b&gt;</str>
       <str name="hl.simple.post">&lt;/b&gt;</str>
       <str name="f.content.hl.snippets">3</str>
       <str name="f.content.hl.fragsize">200</str>
       <str name="f.content.hl.maxAnalyzedChars">200000</str>
       <str name="f.content.hl.alternateField">content</str>
       <str name="f.content.hl.maxAlternateFieldLength">750</str>
     </lst>

    </requestHandler>

  <!-- A request handler that returns indented JSON by default -->
  <requestHandler name="/query" class="solr.SearchHandler">
     <lst name="defaults">
       <str name="echoParams">explicit</str>
       <str name="wt">json</str>
       <str name="indent">true</str>
       <str name="df">content</str>
       <str name="hl">on</str>
       <str name="hl.fl">content</str>
       <str name="hl.simple.pre">&lt;b&gt;</str>
       <str name="hl.simple.post">&lt;/b&gt;</str>
       <str name="f.content.hl.snippets">3</str>
       <str name="f.content.hl.fragsize">200</str>
       <str name="f.content.hl.maxAnalyzedChars">200000</str>
       <str name="f.content.hl.alternateField">content</str>
       <str name="f.content.hl.maxAlternateFieldLength">750</str>
     </lst>
  </requestHandler>

It returns up to three highlight snippets, with the search text in bold. For example, if I search for "Lorem" as the query term, it returns a highlight like this:

<b>Lorem</b> ipsum dolor sit amet 2016, consectetur adipiscing elit. Sed volutpat metus lorem, a placerat nibh sodales in. Cras in mauris tempus, vulputate felis eu, tincidunt erat.

But when I restrict the search to documents published between one year ago and now, it highlights two terms. For example, if I search for "Lorem" and docPublishDate:[2015-01-20 TO 2016-01-20], it returns a highlight like this:

<b>Lorem</b> ipsum dolor sit amet <b>2016</b>, consectetur adipiscing elit. Sed volutpat metus lorem, a placerat nibh sodales in. Cras in mauris tempus, vulputate felis eu, tincidunt erat.

I don't want Solr to highlight the 2016 text as well; I want it to bold only Lorem. What should I do to achieve this?

Answer by MatsLindh (best answer)

Use a filter query to limit the set of documents to be returned instead - filters given as fq parameters are not used for highlighting.
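
As a rough illustration of the filter-query approach, the sketch below builds a request URL in which only the search term goes into q and the date restriction moves into fq. The host, port, and core name (mycore) are hypothetical placeholders, and the date-math range [NOW-1YEAR TO NOW] is an assumed way to express "the last year"; the field name docPublishDate and the /query handler come from the question.

```python
from urllib.parse import urlencode, parse_qs, urlsplit

# Hypothetical host and core name; adjust to your own installation.
SOLR_QUERY_URL = "http://localhost:8983/solr/mycore/query"


def build_filtered_search_url(term, date_range="[NOW-1YEAR TO NOW]"):
    """Build a search URL with the term in q and the date range in fq."""
    params = {
        # Main query: only this part should feed the highlighter.
        "q": term,
        # Filter query: restricts the result set, per the answer above.
        "fq": f"docPublishDate:{date_range}",
    }
    return f"{SOLR_QUERY_URL}?{urlencode(params)}"


filtered_url = build_filtered_search_url("Lorem")
print(filtered_url)
# Decode the query string again to inspect the parameters that will be sent.
print(parse_qs(urlsplit(filtered_url).query))
```

The highlighting defaults from the request handler (hl, hl.fl, and so on) still apply, so nothing else needs to be passed in the request.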

You can also use the hl.q parameter to use a specific query for highlighting, so you could also submit the query to the highlighter without the date part - but this case seems to be better suited to using a filter query.
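
For comparison, here is a minimal sketch of the hl.q alternative: the date clause stays in the main query, and hl.q carries only the term that should be highlighted. As before, the host and core name are hypothetical, and the date range syntax is an assumption rather than something taken from the question.

```python
from urllib.parse import urlencode, parse_qs, urlsplit

# Hypothetical host and core name; adjust to your own installation.
SOLR_QUERY_URL = "http://localhost:8983/solr/mycore/query"


def build_hlq_search_url(term, date_range="[NOW-1YEAR TO NOW]"):
    """Build a search URL whose highlighter query (hl.q) omits the date clause."""
    params = {
        # Main query still contains both the term and the date restriction.
        "q": f"{term} AND docPublishDate:{date_range}",
        # Highlighter query: only the term, so the date clause is not used for highlighting.
        "hl.q": term,
    }
    return f"{SOLR_QUERY_URL}?{urlencode(params)}"


hlq_url = build_hlq_search_url("Lorem")
print(hlq_url)
# Decode the query string again to inspect the parameters that will be sent.
print(parse_qs(urlsplit(hlq_url).query))
```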