Work stealing and deques

1.8k views Asked by At

Why do we need a deque for work-stealing? (e.g. in Cilk) The owner works on the top and the thief steals from the bottom. Why is it useful?

We might have multiple thieves stealing from the bottom. So, don't we need a lock anyway? I have read somewhere that larger jobs (for example created in a tree) are added to the bottom. So, stealing from bottom is more efficient (less communication, as the thieves become more busy by stealing them). Is that it?

3

There are 3 answers

0
Barry Tannenbaum - Intel On

The details of the THE protocol are described in section 5 of "The Implementation of the Cilk-5 Multithreaded Language" which is available from MIT: http://supertech.csail.mit.edu/papers/cilk5.pdf

2
shams On

You do not need a deque for work-stealing. It is possible (and people have done it) to use a concurrent data structure to store the pool of tasks. But the problem is that push/pop operations from workers and steal requests from thieves all have to be synchronized.

Since steals are expected to be relatively rare events, it is possible to design a data structure such that synchonization is performed mianly during steal attempts and even then when it is likely that there might be a conflict in popping an item from the data structure. This is exactly why deques were used in Cilk - to minimize synchronization. Workers treat their own deques as a stack, pushing and popping threads from the bottom, but treat the deque of another busy worker as a queue, stealing threads only from the top, whenever they have no local threads to execute. Since steal operation are synchronized, it is okay for multiple thieves to attempt to steal from the same victim.

Larger jobs being added to the bottom is common in divide-and-conquer style algorithms, but not all. There is a wide variety of strategies in place for what to do during stealing. Steal one task, few tasks, half the tasks, and so on. Each of these variants work well for some applications and not so well in others.

3
guilhermemtr On

Work stealing actually really needs a deque. In the original paper, they have proved the maximum used memory on a system with P processors. The limit is given by the maximum size of any stack times the number of processors. That is actually only possible by following the busy leaves theorem. Also, another important feature of work stealing is that: When a worker does a spawn, it immediately saves the spawner on the deque and starts working on the child. For more information regarding their proofs, please read their original paper, in which they explain all I am saying. http://supertech.csail.mit.edu/papers/steal.pdf

Concurrency control in the work stealing deque accesses are not related to the work stealing scheduler, and in fact, much research has been made towards removing the locks from the deque (by using lock free structures) and also to minimize as much as possible memory barriers. For example in this paper (that i am sorry if can not access, but you can read the abstract anyways to get the idea): http://dl.acm.org/citation.cfm?id=1073974 the authors create a new deque for improving the afore mentioned aspects.

The steals are made from the side that the worker is not working on for possibly several reasons: Since the deque acts as a stack for each worker (the owner of the deque) the "bigger" jobs should be on top of it (as you can understand by reading the paper). When I say bigger I want to mean that those are probably the ones that will have more computation to do. Also, another important aspect is that by doing so (stealing from the deque owner's opposite work side) reduces the contention as in some new deque's both a victim and a thief may be working at the same time on the same deque.