SolrCloud & Data Import Handler


I am planning to upgrade Solr from a single instance to SolrCloud. Currently I have 5 cores, each configured with a Data Import Handler (DIH). I have deployed a web application alongside solr.war inside the Tomcat folder, and it triggers full imports and delta imports periodically according to my project's needs.

Now I am planning to create 2 shards for this application, with half of the data from my 5 cores in each shard. I do not understand how DIH will work in SolrCloud:

  • Is it fine if I start full-indexing from both shards?
  • Or do I need to run the full indexing from only one shard?

The architecture will look like the diagram below:

[Image: architecture diagram]


There are 2 answers

5
Calin Grecu On

It all depends on how you create your SolrCloud collection: with the compositeId router or with the implicit router.

With compositeId routing, Solr takes care of spreading the documents across all available shards, so you can initiate the import from any SolrCloud node. In the end, the imported documents are indexed across all shards.

With implicit routing, you control which shard each document is indexed on.

You also do not have to use DIH. Alternatively, you can write a small app that uses the Solr client to populate the index, which gives you more control.
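As a rough illustration of both options, here is a minimal Python sketch using only the standard library. It assumes DIH is registered at `/dataimport` in solrconfig.xml (the usual path, but it depends on your config), and the base URL, the collection name `products`, and the shard name `shard1` are hypothetical. The first helper builds a DIH command URL to send to a single node; the second builds a plain JSON update request, with an optional `_route_` parameter for collections that use the implicit router.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def dih_url(base_url, collection, command, clean=False, commit=True,
            handler="/dataimport"):
    """Build the URL for a DIH command (full-import, delta-import, status).

    `handler` is an assumption: it must match the request handler name
    that DIH is registered under in solrconfig.xml.
    """
    params = {"command": command, "wt": "json"}
    if command in ("full-import", "delta-import"):
        # clean=false keeps existing documents; commit=true commits at the end.
        params["clean"] = str(clean).lower()
        params["commit"] = str(commit).lower()
    return f"{base_url.rstrip('/')}/{collection}{handler}?{urlencode(params)}"


def update_request(base_url, collection, docs, route=None):
    """Build a JSON update request, as an alternative to DIH.

    `route` is only meaningful with the implicit router, where it names
    the target shard through the `_route_` parameter.
    """
    params = {"commit": "true"}
    if route is not None:
        params["_route_"] = route
    url = f"{base_url.rstrip('/')}/{collection}/update?{urlencode(params)}"
    body = json.dumps(docs).encode("utf-8")
    return Request(url, data=body,
                   headers={"Content-Type": "application/json"},
                   method="POST")


def send(url_or_request, timeout=30):
    """Send a prepared URL or Request to Solr and return the response text."""
    with urlopen(url_or_request, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    base = "http://localhost:8983/solr"  # hypothetical node address
    # Trigger the import on ONE node only.
    print(dih_url(base, "products", "full-import"))
    # Or index documents directly, pinning them to a shard (implicit router).
    req = update_request(base, "products", [{"id": "1"}], route="shard1")
    print(req.full_url)
```

Nothing is sent until you call `send(...)` with one of the prepared values, so you can inspect the URLs first and adjust them to your setup.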

0
Javadroider On

After a lot of googling and reading, I finally decided to implement DIH as follows. Please let me know in the comments if you see any issues with this architecture.

[Image: architecture diagram]