Using a JSR-352 batch job with Java EE, I'm trying to process items in chunks from a source, in partitions. On a retryable exception I want to be able to return to a past checkpoint, so that I can get the items already read from the source.
The nature of the source is such that, in a parallel environment, I cannot request the same chunk of items twice. The only feasible way to get exactly the same items when reading twice is to restart the whole job.
I need to write a generic ItemReader that can manage sources of this kind (so that it is reusable). This basically means that I want to find a nice, clear design/implementation for such a reader.
To achieve the required behavior of the ItemReader, what I currently do is fetch the items at the beginning of readItem() if they have not yet been fetched for the current chunk, and then iterate through them one by one. To handle retryable exceptions, I'm trying to use the checkpoint facility of the ItemReader.
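A minimal sketch of the approach described above, to make it concrete. The `ItemReader` interface here is a local stand-in with the same four methods as `javax.batch.api.chunk.ItemReader`, so the example compiles without a batch runtime. `BufferingReader` and its `Supplier`-based source are hypothetical names, and having the checkpoint hold the not-yet-returned items is an assumption rather than a detail taken from the actual reader.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Supplier;

// Local stand-in with the same four methods as javax.batch.api.chunk.ItemReader,
// declared here only so the sketch compiles without a batch runtime.
interface ItemReader {
    void open(Serializable checkpoint) throws Exception;
    void close() throws Exception;
    Object readItem() throws Exception;
    Serializable checkpointInfo() throws Exception;
}

// Hypothetical reader: pulls a whole batch from the source when its buffer is
// empty, then hands the buffered items out one at a time.
class BufferingReader implements ItemReader {
    // Assumption: every call returns a new batch that cannot be requested again.
    private final Supplier<List<String>> source;
    private ArrayList<String> buffer = new ArrayList<String>();
    private int position = 0;

    BufferingReader(Supplier<List<String>> source) {
        this.source = source;
    }

    @Override
    @SuppressWarnings("unchecked")
    public void open(Serializable checkpoint) {
        // A non-null checkpoint carries the items that were still buffered when it was taken.
        buffer = checkpoint == null
                ? new ArrayList<String>()
                : new ArrayList<String>((List<String>) checkpoint);
        position = 0;
    }

    @Override
    public Object readItem() {
        if (position >= buffer.size()) {
            // Buffer exhausted: fetch the next batch from the source.
            buffer = new ArrayList<String>(source.get());
            position = 0;
            if (buffer.isEmpty()) {
                return null; // null tells the batch runtime there is no more data
            }
        }
        return buffer.get(position++);
    }

    @Override
    public Serializable checkpointInfo() {
        // Snapshot of the items fetched but not yet handed out.
        return new ArrayList<String>(buffer.subList(position, buffer.size()));
    }

    @Override
    public void close() {
        // Nothing to release in this sketch.
    }
}
```

The reader only produces a snapshot when `checkpointInfo()` is called, so whether the buffered items survive a retry depends entirely on when the runtime decides to call it.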
The problem I'm facing is that checkpoints are loaded in the open(...) method, before readItem(), and saved only after the chunk has completed successfully. As a result, I cannot save the items of the chunk into a valid checkpoint before the chunk has to be retried because of a retryable exception.
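The timing problem can be illustrated with a small simulation. This is a hypothetical driver that mimics the ordering described above (checkpoint passed to `open`, `checkpointInfo()` consulted only after a successful chunk); it is not the actual batch runtime, and `CheckpointTimingDemo`, `Reader`, and `runChunk` are names invented for the example. The source is modeled as a queue that hands out each item only once.

```java
import java.io.Serializable;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

class CheckpointTimingDemo {

    // Minimal reader: replays items from the checkpoint first, then polls the source.
    static final class Reader {
        private final Deque<String> source;       // assumed: each item can be taken only once
        private Deque<String> replay = new ArrayDeque<String>();
        private List<String> inFlight = new ArrayList<String>();

        Reader(Deque<String> source) {
            this.source = source;
        }

        @SuppressWarnings("unchecked")
        void open(Serializable checkpoint) {
            replay = checkpoint == null
                    ? new ArrayDeque<String>()
                    : new ArrayDeque<String>((List<String>) checkpoint);
            inFlight = new ArrayList<String>();
        }

        String readItem() {
            String item = replay.isEmpty() ? source.poll() : replay.poll();
            if (item != null) {
                inFlight.add(item);
            }
            return item;
        }

        Serializable checkpointInfo() {
            return new ArrayList<String>(inFlight);
        }
    }

    // Reads one chunk of two items; optionally fails before the chunk can commit.
    static List<String> runChunk(Reader reader, boolean failBeforeCommit) {
        List<String> read = new ArrayList<String>();
        for (int i = 0; i < 2; i++) {
            String item = reader.readItem();
            if (item != null) {
                read.add(item);
            }
        }
        if (failBeforeCommit) {
            throw new IllegalStateException("simulated retryable failure");
        }
        return read;
    }

    // Returns the items seen by the retried chunk.
    static List<String> retryScenario() {
        Reader reader = new Reader(new ArrayDeque<String>(Arrays.asList("a", "b", "c", "d")));
        Serializable committed = null;            // no chunk has committed yet
        reader.open(committed);
        try {
            runChunk(reader, true);
            committed = reader.checkpointInfo();  // only reached when the chunk succeeds
        } catch (IllegalStateException retryable) {
            reader.open(committed);               // rollback: reopen with the last committed checkpoint
        }
        return runChunk(reader, false);
    }
}
```

In this simulation the failed attempt consumes "a" and "b" from the source, but no checkpoint containing them is ever taken, so the reopened reader has nothing to replay and the retried chunk continues with whatever the source yields next.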
My question: is there a way to augment the behavior of checkpoints so that they are saved after the initial readItem(), or do you know of any other nice, clear strategy that avoids additional listeners or userTransientData, which would make the reader hard to integrate into other batch job steps with the same read behavior?