performance issue in getting millions of record from database and processing in ERP in mule esb

1.1k views Asked by At

We are trying to fetch millions of record from database and processing in ERP system per day and we are facing performance issue, is there any solution regarding this in Community?

What is the best way to process the records in mule? So should we use batch or is there any alternate to it? And if we use batch or any other solution, how can we use it so as not to face any performance issue?

2

There are 2 answers

0
Julie Russell On

Batch sounds like what you want to do. For each batch step Mule creates a batch job instance and each instance contains a persistent queue with the batched records. However, it does a deep copy of the MuleEvent containing the flow variables, flow construct, message, processing time, session and exchange pattern so beware, make sure you keep a light footprint before going into your batch job. If you have to set the payload with millions of records to flow variables to do some manipulation, make sure you delete them before you start executing the batch. It will load these batch steps in memory and execute them concurrently so the amount of memory you will need will be the size of the batch job instance (in particular the MuleEvent) by the number of batch steps.

0
Charles On

Since we don't have details on your specific situation, here are some general ideas. You will definitely need to do performance testing when dealing with large data sets to make sure your flow design is performing well.

Just to clarify, I'm giving options below that show streaming, which are slightly less performant, but will allow you to process large datasets. If you can handle the dataset in memory and you want faster processing, then turn off streaming.

  • Test your db queries outside of mule to make sure they are performant and tables are properly indexed.
  • Use streaming db connection. Tweak chunk size for performance testing. (Using this with batch scope is a good combo)
  • If using on-premise runtime, do performance tuning.
  • Use batch scope (enterprise edition)