I have several documents with titles like these:
1. "Just some Word 13 from year 2015"
2. "Just some Word 13 from year 2011"
3. "Just some Word 13 from year 2012"
4. "Just some Word 13 from year 2014"
5. "Just some Word 13 from year 2013"
When searching for 13, I expect number 5 (the title ending in 2013) to be the first result, because 13 occurs twice in it.
The field is multiValued="true".
My fieldtype for indexing looks like this:
<analyzer type="index">
  <!-- Strip punctuation; the double quote must be escaped as &quot; inside an XML attribute -->
  <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="[(&quot;)(,:;!?)]" replacement=""/>
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <filter class="solr.LowerCaseFilterFactory"/>
  <!-- Reverse, take front edge n-grams, reverse again: this indexes the suffixes of each token -->
  <filter class="solr.ReverseStringFilterFactory"/>
  <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="30" side="front"/>
  <filter class="solr.ReverseStringFilterFactory"/>
</analyzer>
The solution was a Solr copyField directive: indexing the title twice, once with and once without the EdgeNGramFilterFactory, makes it possible to boost direct hits.
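For anyone landing here, a minimal sketch of what such a setup could look like. The names `title`, `title_exact`, `text_suffix_ngram` and `text_exact`, as well as the boost value of 10, are my own assumptions for illustration and not taken from the original schema. `text_suffix_ngram` stands for the n-gram field type shown above.

```xml
<!-- schema.xml (sketch; field and type names are hypothetical) -->

<!-- Same analysis as the n-gram type, but without the reverse/EdgeNGram filters,
     so only whole tokens are indexed -->
<fieldType name="text_exact" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="[(&quot;)(,:;!?)]" replacement=""/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- "title" uses the n-gram type; "title_exact" receives a copy of the raw value -->
<field name="title" type="text_suffix_ngram" indexed="true" stored="true" multiValued="true"/>
<field name="title_exact" type="text_exact" indexed="true" stored="false" multiValued="true"/>

<copyField source="title" dest="title_exact"/>

<!-- solrconfig.xml (sketch): search both fields, weighting whole-token hits higher.
     The boost of 10 is an arbitrary starting point, not a recommended value. -->
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">edismax</str>
    <str name="qf">title title_exact^10</str>
  </lst>
</requestHandler>
```

copyField copies the original input value, not the analyzed tokens, so each destination field runs its own analyzer chain. The final ordering still depends on the similarity in use and on the boost chosen, so it is worth checking the scores with debugQuery=true.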