We are orchestrating a data pipeline with AWS Step Functions and need to run EMR jobs in parallel. I tried the Map state and it works as expected. The only problem is that if one iteration fails, Map cancels all the other iterations as well. To work around this, I was thinking of building an array of steps and passing it dynamically to Branches in a Parallel state, but I have not been able to do that because Branches does not accept a string. Is there a workaround, or can branches in a Parallel state only be hard-coded? Could States.Array() be helpful in some way here?
AWS steps parallel state to orchestrate EMR jobs
283 views Asked by manu At
There are 2 answers
For anyone looking for a solution to the problem above: as suggested by Pooya, I put the Catch block on the task inside the Map state rather than keeping it at the Map level. The state machine looks like this
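The original definition was not preserved with this answer. As a rough illustration only, the sketch below builds a Map state whose inner EMR task carries its own Catch, so a failed step is recorded and the iteration still ends normally instead of failing the whole Map. The state names (`RunEmrStep`, `MarkFailed`, `RunAllSteps`) and the input fields (`$.steps`, `$.clusterId`, `$.name`, `$.args`) are assumptions made for the example, not taken from the poster's state machine. It uses the `Iterator` field; newer definitions may use `ItemProcessor` instead.

```python
import json


def build_map_definition():
    """Return an Amazon States Language definition as a plain dict.

    All state names and input paths here are hypothetical.
    """
    run_emr_step = {
        "Type": "Task",
        # Step Functions service integration that adds an EMR step and waits for it.
        "Resource": "arn:aws:states:::elasticmapreduce:addStep.sync",
        "Parameters": {
            "ClusterId.$": "$.clusterId",
            "Step": {
                "Name.$": "$.name",
                "ActionOnFailure": "CONTINUE",
                "HadoopJarStep": {
                    "Jar": "command-runner.jar",
                    "Args.$": "$.args",
                },
            },
        },
        # Catch sits on the task, inside the iteration, not on the Map state.
        "Catch": [
            {
                "ErrorEquals": ["States.ALL"],
                "ResultPath": "$.error",
                "Next": "MarkFailed",
            }
        ],
        "End": True,
    }

    # The fallback is a Pass state, so the iteration finishes successfully
    # while keeping the error details under $.error.
    mark_failed = {
        "Type": "Pass",
        "Result": "FAILED",
        "ResultPath": "$.status",
        "End": True,
    }

    return {
        "StartAt": "RunAllSteps",
        "States": {
            "RunAllSteps": {
                "Type": "Map",
                "ItemsPath": "$.steps",
                "MaxConcurrency": 0,
                "Iterator": {
                    "StartAt": "RunEmrStep",
                    "States": {
                        "RunEmrStep": run_emr_step,
                        "MarkFailed": mark_failed,
                    },
                },
                "End": True,
            }
        },
    }


definition = build_map_definition()
print(json.dumps(definition, indent=2))
```

With this shape, each element of the Map output either holds the task result or the original item plus `error` and `status` fields, so a later state can inspect which steps failed.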
Wrap the inner state machine in a one-branch Parallel state and add error/retry policies to it. Basically, you want to catch all errors and ensure that the iteration always succeeds.
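A minimal sketch of that wrapping, assuming the inner states already exist as a dict: the helper below puts them into a Parallel state with a single branch and attaches Retry and Catch to the Parallel state itself. The helper name, the state names `GuardedIteration` and `IterationFailed`, and the retry numbers are hypothetical values chosen for illustration.

```python
import json


def wrap_in_single_branch_parallel(inner_states, start_at,
                                   fallback="IterationFailed"):
    """Wrap an inner state machine in a one-branch Parallel state.

    Errors from anywhere inside the branch surface on the Parallel state,
    where one Retry/Catch policy covers all of them.
    """
    return {
        "StartAt": "GuardedIteration",
        "States": {
            "GuardedIteration": {
                "Type": "Parallel",
                "Branches": [{"StartAt": start_at, "States": inner_states}],
                # Retry values are placeholders, tune them for real jobs.
                "Retry": [
                    {
                        "ErrorEquals": ["States.ALL"],
                        "IntervalSeconds": 30,
                        "MaxAttempts": 2,
                        "BackoffRate": 2.0,
                    }
                ],
                # After retries are exhausted, route to a state that succeeds.
                "Catch": [
                    {
                        "ErrorEquals": ["States.ALL"],
                        "ResultPath": "$.error",
                        "Next": fallback,
                    }
                ],
                "End": True,
            },
            # A Pass state ends the iteration successfully.
            fallback: {"Type": "Pass", "End": True},
        },
    }


# Hypothetical inner state machine with a single placeholder state.
inner = {"DoWork": {"Type": "Pass", "End": True}}
iterator = wrap_in_single_branch_parallel(inner, "DoWork")
print(json.dumps(iterator, indent=2))
```

The returned dict can serve as the body of a Map state's iterator. Note that a Parallel state outputs an array of branch results, so on success the iteration output is a one-element array.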