Advantage of using Spring Cloud Data Flow instead of Spring Batch

We need to write an application to read a flat file every day and write into a database table. We are planning to use Spring Batch to do this job.

The constraints and possible extensions we have in mind are:

1. The application itself would run in a single VM. It would never be deployed in more than one VM at any time.

2. We might have other files in the future that follow the same pattern.

In this scenario, does using Spring Cloud Data Flow provide any features or advantages over Spring Batch?

There is 1 answer

Answered by Sabby Anandan

I tried to summarize the general feature capabilities and the simplification that Spring Cloud Data Flow (SCDF) offers in this SO thread - perhaps this could be useful.

In your case,

The application itself would run in a single VM. It would never be deployed in more than one VM at any time.

Not sure if this is a question or a requirement. I'm going to assume you're wondering how to scale out your batch-job operation.

With a remote-partitioned batch job, each of the workers you have configured runs in a separate process/container, and the master step defined in your batch job coordinates the workers and the data partitions. This would be an example of a parallelized operation - here's a sample.
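To make the master/worker idea concrete, here is a minimal plain-Java sketch. It does not use Spring Batch's partitioning API; the class `PartitionSketch` and its methods are hypothetical names for illustration, and threads stand in for the separate worker processes/containers that remote partitioning would use. The "master" splits the file's line range into partitions and collects the workers' results.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustration only: NOT Spring Batch's API. Threads stand in for remote workers.
class PartitionSketch {

    // A partition is a half-open range [start, end) of line indexes in the input file.
    record Partition(int start, int end) {}

    // "Master" side: split totalLines into at most `workers` contiguous ranges.
    // Assumes workers >= 1.
    static List<Partition> split(int totalLines, int workers) {
        List<Partition> partitions = new ArrayList<>();
        int chunk = (totalLines + workers - 1) / workers; // ceiling division
        for (int start = 0; start < totalLines; start += chunk) {
            partitions.add(new Partition(start, Math.min(start + chunk, totalLines)));
        }
        return partitions;
    }

    // "Worker" side: placeholder for "read these lines and write them to the table".
    // Returns the number of lines it was responsible for.
    static int process(Partition p) {
        return p.end() - p.start();
    }

    // Master coordination: hand each partition to a worker, then sum the results.
    static int run(int totalLines, int workers) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (Partition p : split(totalLines, workers)) {
                Callable<Integer> task = () -> process(p);
                results.add(pool.submit(task));
            }
            int processed = 0;
            for (Future<Integer> result : results) {
                processed += result.get(); // blocks until that worker finishes
            }
            return processed;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(split(10, 3));
        System.out.println("lines processed: " + run(10, 3));
    }
}
```

For a single daily flat file on a single VM, this kind of partitioning is optional; it only matters once one process can no longer get through the file in the time you have.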

And we might have other files in future to follow the same pattern

Great. Once you have your batch job defined and registered in SCDF, you can launch or re-launch it at any time. You'd use SCDF's REST APIs, Shell, or Dashboard to do so.

Depending on the runtime platform where you're running SCDF and the batch job, you could take advantage of the platform-specific scheduler to schedule the batch job via the REST APIs exposed in SCDF.
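As a rough sketch of launching a registered job over REST, the following Java builds the HTTP request a scheduler or cron-style wrapper could send. The assumptions here are mine: an SCDF server at `http://localhost:9393`, a task definition named `fileToDbJob`, an `--inputFile` argument, and the `POST /tasks/executions` endpoint with `name` and `arguments` form parameters as described in the SCDF 2.x REST guide. Check the REST documentation for the SCDF version you run, since the launch endpoint may differ.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Sketch only: server URL, task name, and argument are hypothetical placeholders.
class ScdfLaunchSketch {

    // Builds the launch request without sending it, so it can be inspected first.
    static HttpRequest buildLaunchRequest(String serverUrl, String taskName, String arguments) {
        String form = "name=" + URLEncoder.encode(taskName, StandardCharsets.UTF_8)
                + "&arguments=" + URLEncoder.encode(arguments, StandardCharsets.UTF_8);
        return HttpRequest.newBuilder()
                .uri(URI.create(serverUrl + "/tasks/executions")) // assumed SCDF 2.x endpoint
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(form))
                .build();
    }

    public static void main(String[] args) throws Exception {
        HttpRequest request = buildLaunchRequest(
                "http://localhost:9393", "fileToDbJob", "--inputFile=/data/in/today.csv");

        if (args.length > 0 && args[0].equals("--send")) {
            // Only contacts the server when explicitly asked to.
            HttpResponse<String> response = HttpClient.newHttpClient()
                    .send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(response.statusCode() + " " + response.body());
        } else {
            // Dry run: show what would be sent.
            System.out.println(request.method() + " " + request.uri());
        }
    }
}
```

Supporting a new file that follows the same pattern would then mostly mean registering another task definition and launching it the same way with a different name.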