Solr DIH in a cluster environment


I have a SolrCloud environment configured, up and running with no issues. Now I need to run a delta import in a loop: every time an import finishes, the next one should start.

Considerations:

  1. The same DIH configuration is deployed on all nodes.
  2. The 3 Solr nodes run behind a load balancer, so the command can be executed on any of them.
  3. I don't want to start the importer on a second node if it is already running on another node.
  4. I would like to start the next DIH run as soon as the previous one finishes.
  5. If a node goes down during an import, I would like to be able to decide "this is taking too long, let's start another import process". It would also be great to identify the node where the import was running when it went down, so I can check it and record that information to find out the cause.
  6. The database receives a large number of events every minute, and I need all of these events (DB records) reflected in Solr so the documents stay up to date.
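
To make considerations 3 to 5 concrete, here is a minimal sketch in Java of the decision an external scheduler could take on each poll. It assumes the DIH `status` command reports `idle` or `busy`; the class name `DihLoopPolicy`, the `Action` values and the `maxRun` limit are hypothetical names introduced only for illustration, not part of Solr.

```java
import java.time.Duration;

/** Hypothetical helper: decides what the external scheduler does after one status poll. */
final class DihLoopPolicy {

    /** Possible decisions for one polling cycle (illustrative names). */
    enum Action { START_DELTA_IMPORT, WAIT, FLAG_STALLED_AND_RESTART }

    /**
     * @param reachable whether the node answered the status request at all
     * @param status    the "status" value reported by DIH, assumed to be "idle" or "busy"
     * @param elapsed   time since this scheduler last started an import
     * @param maxRun    assumed upper bound for a single delta import
     */
    static Action decide(boolean reachable, String status, Duration elapsed, Duration maxRun) {
        if (!reachable) {
            // The node did not answer: record which node it was, then import elsewhere.
            return Action.FLAG_STALLED_AND_RESTART;
        }
        if ("busy".equals(status)) {
            // An import is still running. Give up only once it exceeds maxRun;
            // the caller would probably send command=abort to that node first.
            return elapsed.compareTo(maxRun) > 0 ? Action.FLAG_STALLED_AND_RESTART : Action.WAIT;
        }
        // Idle: the previous import has finished, so start the next one right away.
        return Action.START_DELTA_IMPORT;
    }
}
```

With this policy, an `idle` node leads straight to the next delta import, a `busy` node within the time limit is left alone, and an unreachable or overrunning node is flagged so the import can be started on another node.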

Options and thoughts

  1. I'm thinking of using JBoss EAP 5.1 to run an external app with the TimerService. I have a JBoss cluster, so I can ensure it runs continuously, polling for status and restarting the DIH process in a loop.
  2. I have been looking at and testing the DIH event listener:

    <dataConfig>
      <document onImportEnd="com.me.MyNotificationService">
        ....
      </document>
    </dataConfig>
    

com.me.MyNotificationService can tell me when the import has finished, but I still don't know how to connect it to the external "run Solr import" app, since the listener is a library running outside my JBoss AS container. Also, if the Solr node goes down, I lose the notification as well.

  3. Is there a way to ensure this loop won't be broken? If all of this could be managed by the Solr cluster itself (including situations such as a node going down in the middle of an import), I could drop the external "run Solr import" app, but I don't think that is possible.

  4. It would be really useful to be able to tell the Solr cluster to execute the import on a specific node (say node 2) and then be notified when it finishes, or to have a way to ask for the status on that specific node 2 even when my request reaches node 1 because of the load balancer.
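
One way to picture asking a specific node is to bypass the load balancer and call that node's own address. The sketch below, in Java, only builds the request URL and extracts the status from a JSON response. It assumes the handler is registered under the conventional `/dataimport` path (this is set in `solrconfig.xml` and may differ) and that the response carries a quoted `"status"` field; `DihNodeStatus`, the node URL and the core name are hypothetical.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for talking to the DIH handler of one specific Solr node. */
final class DihNodeStatus {

    // Matches a quoted "status" value such as "idle" or "busy". The numeric
    // status inside responseHeader is unquoted, so it is not matched.
    private static final Pattern STATUS = Pattern.compile("\"status\"\\s*:\\s*\"(\\w+)\"");

    /** Builds a DIH command URL against a node's direct address (assumed handler path). */
    static String commandUrl(String nodeBaseUrl, String core, String command) {
        String base = nodeBaseUrl.endsWith("/")
                ? nodeBaseUrl.substring(0, nodeBaseUrl.length() - 1)
                : nodeBaseUrl;
        return base + "/" + core + "/dataimport?command=" + command + "&wt=json";
    }

    /** Extracts the DIH status from a JSON response body, if one is present. */
    static Optional<String> parseStatus(String jsonBody) {
        Matcher m = STATUS.matcher(jsonBody);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }
}
```

For example, `commandUrl("http://node2:8983/solr", "mycore", "status")` would target node 2 directly regardless of which node the load balancer would have picked, and the same helper could build the `delta-import` request for that node.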

Any recommendations and thoughts are more than welcome.

Thanks.
