Clusters produced by Solr and Carrot2 Workbench not consistent


I'm trying to tune clustering in Solr using Carrot2 Workbench. Workbench produces good results, but Solr does not: its clusters are very different.

My flow:

  • Prepare a set of document IDs and query on those alone (using fq)
  • Tune the algorithm in Workbench and export the XML config
  • Restart Solr to make sure the config is picked up
  • Repeat the same query (I checked the Solr logs to make sure it is exactly the same query that Workbench sends)
  • Compare the clusters. This is where I'm lost: they are completely different, even in structure. Workbench produces longer, more complex labels, while the Solr labels are very simple.
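
To make the comparison step less subjective, the two cluster sets can be matched by document overlap. The following is only a minimal sketch in Python: it assumes both outputs have already been parsed into lists of dicts with "labels" and "docs" keys (the shape of the "clusters" section in Solr's JSON response), and the Workbench output would first have to be converted into that shape. The function names and the sample data are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections of document IDs."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def best_matches(solr_clusters, workbench_clusters):
    """For each Solr cluster, find the Workbench cluster with the highest doc overlap.

    Returns a list of (solr_labels, workbench_labels, overlap) tuples.
    """
    matches = []
    for sc in solr_clusters:
        best = max(
            workbench_clusters,
            key=lambda wc: jaccard(sc["docs"], wc["docs"]),
            default=None,
        )
        if best is None:
            matches.append((sc["labels"], None, 0.0))
        else:
            matches.append((sc["labels"], best["labels"], jaccard(sc["docs"], best["docs"])))
    return matches


# Hypothetical sample data, not real output from Solr or Workbench.
solr = [
    {"labels": ["Memory"], "docs": ["d1", "d2", "d3"]},
    {"labels": ["Cables"], "docs": ["d4", "d5"]},
]
workbench = [
    {"labels": ["DDR Memory Modules"], "docs": ["d1", "d2"]},
    {"labels": ["USB Cables and Adapters"], "docs": ["d4", "d5"]},
]

for solr_labels, wb_labels, overlap in best_matches(solr, workbench):
    print(solr_labels, "<->", wb_labels, round(overlap, 2))
```

High overlap with different labels would suggest that the grouping is similar and only the label generation differs. Low overlap would suggest that the two runs are effectively clustering different input text.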

I tried tweaking parameters both in the XML and in the query, with very little effect. The effect was enough, however, to confirm that the configs are being picked up.

Another thing I checked was the Carrot2 CLI tool. I exported the data from Solr to XML and ran the CLI with the config exported from Workbench. The clusters from the CLI are consistent with Workbench.

That leaves Solr as the odd one out. I use Carrot2 v3.15.1 and Solr 7.2.1.

What am I missing? Why is Solr producing different clusters from the same data and configuration?
