Solr spellchecking not returning results for indexes larger than 10k items

I have a Solr core (Solr version 6.4.1) where I'm also using a spellcheck component. As long as the index holds fewer than 30k documents, my spellchecker works fine. Once the number of documents reaches 30k or more, the spellcheck no longer returns any results. I'm aware of parameters in solrconfig.xml such as maxQueryFrequency and thresholdTokenFrequency, but altering them did not solve the problem.

I also read these: Apache Solr : Search is not returning result for large document indexed, Solr spellchecker not returning any results, solr suggester not returning any results and Solr spellcheckin randomly working, but they didn't help either.

These are the relevant parts in solrconfig.xml:

<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <str name="queryAnalyzerFieldType">text_general</str>
  <lst name="spellchecker">
    <str name="name">default</str>
    <str name="field">_spellcheck_</str>
    <str name="classname">solr.DirectSolrSpellChecker</str>
    <str name="distanceMeasure">internal</str>
    <float name="accuracy">0.5</float>
    <int name="maxEdits">2</int>
    <int name="minPrefix">1</int>
    <int name="maxInspections">5</int>
    <int name="minQueryLength">4</int>
    <float name="maxQueryFrequency">0.1</float>
    <float name="thresholdTokenFrequency">.0000001</float>
  </lst>
</searchComponent>

and, in my request handler:

<bool name="spellcheck">true</bool>
<str name="spellcheck.dictionary">default</str>
<str name="spellcheck.extendedResults">false</str>
<str name="spellcheck.count">5</str>
<str name="spellcheck.alternativeTermCount">2</str>
<str name="spellcheck.maxResultsForSuggest">5</str>
<str name="spellcheck.collate">true</str>
<str name="spellcheck.collateExtendedResults">true</str>
<str name="spellcheck.maxCollationTries">5</str>
<str name="spellcheck.maxCollations">3</str>

_spellcheck_ is a copyField destination (source="*"), indexed as text_general, which is defined as:

<fieldType name="text_general" class="solr.TextField" >
  <analyzer type="index">
    <charFilter class="solr.HTMLStripCharFilterFactory" />
    <tokenizer class="solr.ClassicTokenizerFactory" />
    <filter class="solr.ClassicFilterFactory" />
    <filter class="solr.ASCIIFoldingFilterFactory" />
    <filter class="solr.LowerCaseFilterFactory" />
    <filter class="solr.TrimFilterFactory" />
    <filter class="solr.HyphenatedWordsFilterFactory" />
  </analyzer>
  <analyzer type="query">
    <charFilter class="solr.HTMLStripCharFilterFactory" />
    <tokenizer class="solr.ClassicTokenizerFactory" />
    <filter class="solr.ClassicFilterFactory" />
    <filter class="solr.ASCIIFoldingFilterFactory" />
    <filter class="solr.LowerCaseFilterFactory" />
    <filter class="solr.TrimFilterFactory" />
    <filter class="solr.HyphenatedWordsFilterFactory" />
  </analyzer>
</fieldType>

Any advice?

There is 1 answer

Answered by picci

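After some more work I found that the spellcheck.maxResultsForSuggest parameter was responsible. The value of 5 in my configuration was too low for the size of my data set: Solr only offers suggestions when the query returns at most that many hits, and with a larger index even a misspelled query presumably matched more than 5 documents. Setting it to 100 in my search handler solved my problem: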

<str name="spellcheck.maxResultsForSuggest">100</str>
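
For context, here is a rough sketch of where that parameter might sit in solrconfig.xml. The handler name "/select" and the last-components wiring are assumptions for illustration, since the question only shows the parameter list. The other values are copied from the question.

```xml
<!-- Sketch only: the handler name "/select" and the last-components
     section are assumed, not taken from the original configuration. -->
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <bool name="spellcheck">true</bool>
    <str name="spellcheck.dictionary">default</str>
    <str name="spellcheck.count">5</str>
    <!-- Suggestions are offered only when the query returns
         at most this many hits; 5 was too low for a 30k-doc index. -->
    <str name="spellcheck.maxResultsForSuggest">100</str>
    <str name="spellcheck.collate">true</str>
  </lst>
  <!-- Run the spellcheck component after the standard query components. -->
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>
```

The same parameter can also be passed per request (for example, appending spellcheck.maxResultsForSuggest=100 to the query string), which may be a convenient way to try different thresholds before changing the handler defaults.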

Hope this helps somebody.