I use SnowballPorterFilterFactory for index and query analyzers. When i search for "profession" word. Solr successfully finds only articles that contains "profession", but i want "professional" "professionalism" ...
This is the current configuration on schema.xml
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"/>
<filter class="solr.ASCIIFoldingFilterFactory" />
<filter class="solr.SnowballPorterFilterFactory" language="French"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="0" generateNumberParts="0" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="0"/>
<filter class="solr.ASCIIFoldingFilterFactory" />
<filter class="solr.SnowballPorterFilterFactory" language="French"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
</analyzer>
</fieldType>
What is happening is porter is over-stemming your query. When you search for
professionyour keyword gets stemmed down toprofess, whereasprofessionprofessionalandprofessionalismare all stored in the index asprofession.The only real way you are going to get around this is by adding another
fieldTypewhere you do not stem your query.Something like:
With a copyfield like:
<copyField source="your_text_field" dest="text_unstem_query_field"/>