Environment: Solr 8.9.0, Java version "11.0.12" 2021-07-20 LTS
The following .csv file is indexed in Solr:
books_id,cat,name
0553573403,book,Game Thrones Clash
0553573404,book,GameThrones Clash
0553573405,book,GameThronesClash
0553573406,book,GameThronesClas
The schema is defined in managed-schema as follows:
<field name="books_id" type="plong" multiValued="false" indexed="false" stored="true"/>
<field name="cat" type="string" multiValued="false" indexed="false" stored="true"/>
<field name="name" type="text_general" multiValued="false" indexed="true" required="true" stored="true"/>
<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100" multiValued="false">
<analyzer type="index">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.ShingleFilterFactory" minShingleSize="2" maxShingleSize="3"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
I expect that if I query for the book 'GameThronesClash', Solr should also return the other three books. That is why ShingleFilterFactory has been configured with minShingleSize="2" and maxShingleSize="3".
As I understand it, this filter constructs shingles from the token stream, for example:
In: "Game Thrones Clash"
Tokenizer to Filter: "Game"(1), "Thrones"(2), "Clash"(3)
Out: "Game"(1), "GameThrones"(1), "GameThronesClash"(1), "Thrones"(2), "ThronesClash"(2), "Clash"(3)
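To make the expectation above concrete, here is a minimal Python sketch of shingle construction. It is not Solr's implementation: the function name `shingles` and its parameters are hypothetical and only loosely mirror the filter's options. One caveat: the "Out" line above corresponds to an empty separator, whereas the Solr Reference Guide documents a `tokenSeparator` option for ShingleFilterFactory whose default is a single space. The schema shown here does not set it, and LowerCaseFilterFactory runs before the shingle filter, so the indexed shingles may look like "game thrones clash" rather than "GameThronesClash".

```python
def shingles(tokens, min_size=2, max_size=3, sep=" ", output_unigrams=True):
    """Return (term, position) pairs, roughly the way a shingle filter would.

    Hypothetical helper for illustration only. `sep` plays the role of
    tokenSeparator; positions are 1-based and a shingle takes the position
    of its first token.
    """
    out = []
    for i in range(len(tokens)):
        position = i + 1
        if output_unigrams:
            out.append((tokens[i], position))
        for size in range(min_size, max_size + 1):
            if i + size > len(tokens):
                break  # not enough tokens left for a shingle of this size
            out.append((sep.join(tokens[i:i + size]), position))
    return out


if __name__ == "__main__":
    lowercased = ["game", "thrones", "clash"]
    # Separator as a single space (the documented default).
    print(shingles(lowercased))
    # Empty separator, which is what the "Out" line in the post assumes.
    print(shingles(lowercased, sep=""))
```

Comparing the two printed lists with the output of the Analysis page shows which form is actually stored in the index.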
But the following query returns only three documents:
curl -G http://localhost:8983/solr/shingleConcatenationFuzzyCore/select --data-urlencode "q=(name:'GameThronesClash~')"
{
"responseHeader":{
"status":0,
"QTime":15,
"params":{
"q":"(name:'GameThronesClash~')"}},
"response":{"numFound":3,"start":0,"numFoundExact":true,"docs":[
{
"books_id":0553573404,
"cat":"book",
"name":"GameThrones Clash",
"id":"22674fc1-9fc7-4e1b-8d09-231acf39bc25",
"_version_":1743512855396745216},
{
"books_id":0553573405,
"cat":"book",
"name":"GamethronesClash",
"id":"e82a0dee-a3fb-483e-806b-e667490536f4",
"_version_":1743512855375773696},
{
"books_id":0553573406,
"cat":"book",
"name":"GameThronesclas",
"id":"bf240788-81cd-4a51-b62d-5aba778e1dee",
"_version_":1743512855376822272}
]}}
But why does it not return the book with "books_id":0553573403 ("name":"Game Thrones Clash")? What should I change in the query to retrieve the book whose name is "Game Thrones Clash"?
The "Analysis" page in Solr's Admin UI shows the following for the field 'name':
Field value (Index): name:'Game Thrones Clash'

Field value (Index): name:'GameThronesClash'
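Since the query uses the fuzzy operator (~), it may help to check how far each candidate indexed term is from the query term. Lucene fuzzy queries match terms within at most 2 edits. The sketch below is only an aid for reasoning, with two loudly stated assumptions: the candidate terms are guesses at what the index might contain (the real terms depend on the shingle separator and should be read off the Analysis page), and the query term is assumed to be the bare lowercased word (the term actually produced from the single-quoted input can be verified by adding debugQuery=true to the request). It uses plain Levenshtein distance, which is a simplification of Lucene's matching.

```python
def levenshtein(a, b):
    """Plain Levenshtein distance (insert, delete, substitute), row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


MAX_EDITS = 2  # upper bound on edits for a fuzzy term query


def fuzzy_candidates(query_term, candidates, max_edits=MAX_EDITS):
    """Map each candidate term to (distance, within_max_edits)."""
    result = {}
    for term in candidates:
        distance = levenshtein(query_term, term)
        result[term] = (distance, distance <= max_edits)
    return result


if __name__ == "__main__":
    # Assumed candidate terms; replace with the terms from the Analysis page.
    candidates = [
        "gamethronesclash", "gamethronesclas", "gamethrones clash",
        "game thrones clash", "gamethrones", "clash",
    ]
    for term, (distance, ok) in fuzzy_candidates("gamethronesclash", candidates).items():
        print(f"{term!r}: distance={distance}, within {MAX_EDITS} edits: {ok}")
```

Running the same check with the exact term reported by debugQuery=true, instead of the assumed one, shows whether the term for "Game Thrones Clash" falls inside or outside the two-edit limit.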


