Is Work Stealing always the most appropriate user-level thread scheduling algorithm?


I've been investigating different scheduling algorithms for a thread pool I am implementing. Due to the nature of the problem I am solving I can assume that the tasks being run in parallel are independent and do not spawn any new tasks. The tasks can be of varying sizes.

I went straight for the most popular scheduling algorithm, work stealing, using lock-free deques for the local job queues, and I am relatively happy with this approach. However, I'm wondering whether there are any common cases where work stealing is not the best approach.

For this particular problem I have a good estimate of the size of each individual task. Work stealing does not make use of this information, and I'm wondering whether there is any scheduler that would use it to give better load balancing than work stealing (obviously with the same efficiency).

NB: This question ties in with a previous question.

2

There are 2 answers

3
Georg Schölly On

I'd distribute the tasks upfront. Using their estimated running times, you can distribute them into individual queues, one per thread.

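Distributing the tasks is basically a knapsack-style packing problem (strictly speaking, multiway number partitioning, which is NP-hard): each queue should take roughly the same total amount of time. In practice a greedy heuristic is usually good enough.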
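As a rough illustration of such an upfront distribution, here is a minimal sketch of the greedy "longest processing time first" heuristic. This is a swapped-in, well-known approximation, not something the answer prescribes; the function name `distribute_lpt` and the representation of tasks as indices into a list of estimates are assumptions made for the example.

```python
import heapq

def distribute_lpt(estimates, num_threads):
    """Assign task indices to per-thread queues, longest estimate first.

    Hypothetical helper: `estimates[i]` is the estimated running time
    of task i. Returns one list of task indices per thread.
    """
    # Min-heap of (total estimated load so far, thread index).
    loads = [(0.0, i) for i in range(num_threads)]
    heapq.heapify(loads)
    queues = [[] for _ in range(num_threads)]

    # Visit tasks from the longest estimate to the shortest.
    order = sorted(range(len(estimates)),
                   key=lambda t: estimates[t], reverse=True)
    for task in order:
        # Give the task to the thread with the least load so far.
        load, idx = heapq.heappop(loads)
        queues[idx].append(task)
        heapq.heappush(loads, (load + estimates[task], idx))
    return queues

if __name__ == "__main__":
    estimates = [7, 5, 4, 3, 3, 2]
    for q in distribute_lpt(estimates, 2):
        print(q, sum(estimates[t] for t in q))
```

The result is not guaranteed to be optimal, but each task is placed in O(log P) time for P threads, which keeps the upfront cost small compared with solving the partitioning problem exactly.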

You should also add some logic to modify the queues while they run. For example, a redistribution should occur once the real running time differs from the estimated running time by more than a certain amount.
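One way such redistribution logic might look is sketched below. Everything here is an assumption for illustration: the names `should_rebalance` and `rebalance`, the relative `tolerance`, and the `threshold` on the load gap are hypothetical, and the code is single-threaded, so a real thread pool would need synchronization around the queues.

```python
def should_rebalance(estimated, actual, tolerance):
    """True if the real time drifted from the estimate by more than
    `tolerance`, expressed as a fraction of the estimate (assumed policy)."""
    return abs(actual - estimated) > tolerance * estimated

def rebalance(queues, estimates, threshold):
    """Move pending tasks from the most loaded queue to the least loaded
    one until the gap in estimated load is at most `threshold`.

    `queues` is a list of lists of task indices; `estimates[i]` is the
    estimated remaining time of task i. Returns the number of moves.
    """
    def load(q):
        return sum(estimates[t] for t in q)

    moved = 0
    while True:
        heavy = max(queues, key=load)
        light = min(queues, key=load)
        gap = load(heavy) - load(light)
        if gap <= threshold:
            break
        # Only consider tasks whose move actually narrows the gap.
        candidates = [t for t in heavy if 0 < estimates[t] < gap]
        if not candidates:
            break
        task = max(candidates, key=lambda t: estimates[t])
        heavy.remove(task)
        light.append(task)
        moved += 1
    return moved

if __name__ == "__main__":
    estimates = [5, 4, 3, 1]
    queues = [[0, 1, 2], [3]]
    if should_rebalance(estimated=10.0, actual=14.0, tolerance=0.25):
        print(rebalance(queues, estimates, threshold=2), queues)
```

A worker could call `should_rebalance` after finishing each task and trigger `rebalance` only when the drift is large, so the common case pays almost nothing.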

0
guilhermemtr On

It is true that a work-stealing scheduler does not use that information, but that is because it does not depend on it to provide its theoretical bounds (for example, on the memory it uses, on the expected total communication among workers, and on the expected time to execute a fully strict computation, as you can read here: http://supertech.csail.mit.edu/papers/steal.pdf).

One interesting paper (which I hope you can access: http://dl.acm.org/citation.cfm?id=2442538) actually uses bounded execution times to provide formal proofs (which try to stay as close to the original work-stealing bounds as possible).

And yes, there are cases in which work stealing does not perform optimally (for example, unbalanced tree searches and other particular cases). For those cases, however, optimizations have been proposed (for example, letting a thief steal half of the victim's deque instead of taking only one task: http://dl.acm.org/citation.cfm?id=571876).
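To make the steal-half idea concrete, here is a minimal single-threaded sketch using Python's `collections.deque`. It only illustrates the policy (take half of the victim's pending tasks from the end opposite to the one the owner works on); it is not the lock-free algorithm from the linked paper, and the name `steal_half` and the choice to round up are assumptions.

```python
from collections import deque

def steal_half(victim, thief):
    """Move half of the victim's tasks (rounded up) to the thief.

    Assumes the owner pushes and pops at the right end of its deque,
    so the thief takes the oldest tasks from the left end.
    Returns the number of tasks stolen.
    """
    count = (len(victim) + 1) // 2
    for _ in range(count):
        thief.append(victim.popleft())
    return count

if __name__ == "__main__":
    victim = deque(["t1", "t2", "t3", "t4", "t5"])
    thief = deque()
    stolen = steal_half(victim, thief)
    print(stolen, list(thief), list(victim))
```

Compared with stealing one task at a time, this cuts the number of steal attempts an idle worker needs when one queue holds most of the remaining work, at the cost of moving more tasks per steal.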