Is it possible to trigger a Spark job via Oozie when a folder's size reaches a certain threshold?


For instance, if a folder reaches 100 MB, a Spark job should be triggered. I read about the dirSize HDFS EL function in Oozie, but I'm not sure how to use it. Does it trigger the job as soon as the folder reaches 100 MB, or does the size have to be checked periodically, say every 2 minutes?


There is 1 answer

Naveenchandra Patil On

One option is to run an Oozie coordinator periodically (say, every 2 minutes) to check the folder size; if it has reached the specified limit, you can trigger the Spark job.
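
The periodic check described above could be sketched roughly as follows. This is a minimal illustration, not a tested configuration: the workflow uses a decision node with the `fs:dirSize` EL function, which is evaluated each time the workflow runs rather than firing on its own when the folder grows. The property names (`watchDir`, `sparkJar`, `master`, `workflowAppPath`, `startTime`, `endTime`), the application names, and the class `com.example.MyApp` are all hypothetical placeholders, and the schema versions are assumptions that should be matched to your Oozie release. As far as I know, `fs:dirSize` returns the size in bytes of the files directly under the path (it does not recurse) and -1 if the path is missing.

```xml
<!-- workflow.xml: run the Spark action only if the folder holds at least 100 MB -->
<workflow-app name="dir-size-check-wf" xmlns="uri:oozie:workflow:0.5">
    <start to="check-size"/>

    <decision name="check-size">
        <switch>
            <!-- 104857600 bytes = 100 * 1024 * 1024; watchDir is a hypothetical property -->
            <case to="spark-job">${fs:dirSize(watchDir) ge 104857600}</case>
            <!-- below the threshold: finish without doing anything -->
            <default to="end"/>
        </switch>
    </decision>

    <action name="spark-job">
        <spark xmlns="uri:oozie:spark-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <master>${master}</master>
            <name>SizeTriggeredJob</name>
            <!-- placeholder main class and jar -->
            <class>com.example.MyApp</class>
            <jar>${sparkJar}</jar>
        </spark>
        <ok to="end"/>
        <error to="fail"/>
    </action>

    <kill name="fail">
        <message>Spark job failed: ${wf:errorMessage(wf:lastErrorNode())}</message>
    </kill>
    <end name="end"/>
</workflow-app>
```

```xml
<!-- coordinator.xml: launch the workflow above on a fixed schedule -->
<coordinator-app name="dir-size-check-coord" frequency="${coord:minutes(2)}"
                 start="${startTime}" end="${endTime}" timezone="UTC"
                 xmlns="uri:oozie:coordinator:0.4">
    <action>
        <workflow>
            <app-path>${workflowAppPath}</app-path>
        </workflow>
    </action>
</coordinator-app>
```

Two caveats with this sketch: many Oozie installations reject coordinator frequencies shorter than 5 minutes by default, so a 2-minute interval may require a server-side setting change, and once the folder is over the threshold every subsequent run will trigger the Spark job again unless the job (or a later step) moves or clears the processed files.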