How can I give a Luigi task a lower priority than the high-priority task that requires it?


I want to assign different priorities to different tasks in my pipeline, but once I give TaskA a priority, all of its dependencies that have a lower priority are raised to that same priority.

For example, if I have these two tasks:

import luigi

class TaskA(luigi.Task):
    priority = 100
    some_parameter = luigi.IntParameter()

    def requires(self):
        # Parameters are instance attributes, so read them through self.
        return TaskB(some_parameter=self.some_parameter)

    def run(self):
        ...

class TaskB(luigi.Task):
    priority = 40
    some_parameter = luigi.IntParameter()

    def run(self):
        ...

Whenever I run TaskA, TaskB is given priority 100. I want to schedule several instances of TaskA in parallel, but I don't want the scheduler to run all the TaskB instances first and only then the TaskA instances, because TaskB generates files that take up a lot of disk space and TaskA removes them. If I could schedule TaskB with a lower priority, each TaskA would be scheduled as soon as its TaskB finishes, and the scheduler would not have to choose between an instance of TaskA and an instance of TaskB with the same priority.
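To make the behavior I am seeing concrete, here is a minimal sketch of it in plain Python. It is only a toy model of the observation above (a dependency ends up with at least the priority of the task that requires it), not Luigi's actual scheduler code. The names `Node` and `effective_priorities` are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a task: a name, a declared priority, and its dependencies."""
    name: str
    priority: int
    requires: list = field(default_factory=list)


def effective_priorities(roots):
    """Push each task's priority down to its dependencies, only ever raising it."""
    effective = {}

    def visit(node, inherited):
        prio = max(node.priority, inherited)
        # Stop if this task already has a priority at least this high.
        if effective.get(node.name, float("-inf")) >= prio:
            return
        effective[node.name] = prio
        for dep in node.requires:
            visit(dep, prio)

    for root in roots:
        visit(root, root.priority)
    return effective


task_b = Node("TaskB", priority=40)
task_a = Node("TaskA", priority=100, requires=[task_b])

priorities = effective_priorities([task_a])
print(priorities)  # {'TaskA': 100, 'TaskB': 100}
```

In this model TaskB's declared priority of 40 is never visible to the scheduler once TaskA requires it, which matches what I observe: both tasks compete at priority 100.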

The reason this is not a single task is that, in reality, TaskB is several different tasks, and I want to take advantage of running them in parallel before TaskA runs.

