I already know that you can configure crawling to be resumable.
But is it possible to use the resumable functionality to pause the crawling process and then resume it later programmatically? E.g. I could gracefully shut down crawling with the crawler's shutdown method while the resumable parameter is set to true, then start crawling again.
Will it work this way, given that the primary purpose of the resumable parameter is to handle accidental crashes of the crawler? Is there any other or better way to achieve this functionality with crawler4j?
If you set the parameter resumable to true, the Frontier as well as the DocIdServer will store their queues in the user-defined storage folder. This works either for a crash or for a programmatic shutdown. In both cases, the storage folder must be the same.
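The pause-and-resume flow described above could look roughly like the following sketch. It is not taken from the crawler4j documentation: the class names ResumableCrawlSketch and MyCrawler, the helper methods, the storage path /tmp/crawl-storage, the seed URL, and the 30-second pause are all assumptions made for illustration, and the method signatures are those of crawler4j 4.x, so they may differ in other versions.

```java
import edu.uci.ics.crawler4j.crawler.CrawlConfig;
import edu.uci.ics.crawler4j.crawler.CrawlController;
import edu.uci.ics.crawler4j.crawler.Page;
import edu.uci.ics.crawler4j.crawler.WebCrawler;
import edu.uci.ics.crawler4j.fetcher.PageFetcher;
import edu.uci.ics.crawler4j.robotstxt.RobotstxtConfig;
import edu.uci.ics.crawler4j.robotstxt.RobotstxtServer;
import edu.uci.ics.crawler4j.url.WebURL;

public class ResumableCrawlSketch {

    // Hypothetical crawler: stays on one site and prints each visited URL.
    public static class MyCrawler extends WebCrawler {
        @Override
        public boolean shouldVisit(Page referringPage, WebURL url) {
            return url.getURL().startsWith("https://example.com/");
        }

        @Override
        public void visit(Page page) {
            System.out.println("Visited: " + page.getWebURL().getURL());
        }
    }

    // Both runs must use the same storage folder with resumable set to true.
    static CrawlConfig buildConfig(String storageFolder) {
        CrawlConfig config = new CrawlConfig();
        config.setCrawlStorageFolder(storageFolder);
        config.setResumableCrawling(true);
        return config;
    }

    static CrawlController newController(String storageFolder) throws Exception {
        CrawlConfig config = buildConfig(storageFolder);
        PageFetcher pageFetcher = new PageFetcher(config);
        RobotstxtServer robotstxtServer =
                new RobotstxtServer(new RobotstxtConfig(), pageFetcher);
        return new CrawlController(config, pageFetcher, robotstxtServer);
    }

    public static void main(String[] args) throws Exception {
        String storageFolder = "/tmp/crawl-storage"; // assumed path

        // First run: start without blocking, then "pause" via shutdown.
        CrawlController first = newController(storageFolder);
        first.addSeed("https://example.com/");
        first.startNonBlocking(MyCrawler.class, 2);
        Thread.sleep(30_000);      // arbitrary crawl time before pausing
        first.shutdown();          // ask the crawler threads to stop
        first.waitUntilFinish();   // block until they have actually stopped

        // Later: a new controller on the same folder picks up the stored state.
        // Seeds are not re-added, assuming the persisted queue still holds
        // the pending URLs.
        CrawlController second = newController(storageFolder);
        second.start(MyCrawler.class, 2);
    }
}
```

Here the second controller is created in the same process for brevity; the same idea should apply if the resume happens in a separate run of the program, as long as the storage folder is unchanged.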
See also the related issue on the official issue tracker.