Background: I'm writing a network traffic processing kernel module. I get packets via netfilter hooks. All filtering is done inside the hook function, but I don't want to do the packet processing there. So the solution is tasklets or workqueues. I know the difference between them and I can use both, but I have some problems and need advice.
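For context, here is a minimal sketch of that setup, assuming a reasonably recent kernel (the hook prototype and registration API vary by kernel version); identifiers such as my_hook_fn and my_nfho are illustrative, not from the question:

    #include <linux/module.h>
    #include <linux/netfilter.h>
    #include <linux/netfilter_ipv4.h>
    #include <linux/skbuff.h>
    #include <net/net_namespace.h>

    static unsigned int my_hook_fn(void *priv, struct sk_buff *skb,
                                   const struct nf_hook_state *state)
    {
        /* cheap filtering decisions happen here ... */
        /* ... heavier per-packet processing is deferred (see the options below) */
        return NF_ACCEPT;
    }

    static struct nf_hook_ops my_nfho = {
        .hook     = my_hook_fn,
        .pf       = NFPROTO_IPV4,
        .hooknum  = NF_INET_PRE_ROUTING,
        .priority = NF_IP_PRI_FIRST,
    };

    static int __init my_init(void)
    {
        return nf_register_net_hook(&init_net, &my_nfho);
    }

    static void __exit my_exit(void)
    {
        nf_unregister_net_hook(&init_net, &my_nfho);
    }

    module_init(my_init);
    module_exit(my_exit);
    MODULE_LICENSE("GPL");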
Tasklet solution. Preferable. I could create and schedule a tasklet for each packet, but who will delete that tasklet? The tasklet function itself? I don't think it's a good idea to deallocate a tasklet while it is executing. Create a global pool of tasklets instead? Since two tasklets cannot execute on one processor at the same time, the pool size would be the number of processors. But how do I find out when a tasklet is available for reuse? There are only two states, scheduled and running; there is no "done" state. OK, I could probably wrap the tasklet in a struct with a flag. But wouldn't all that be overkill?
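A minimal sketch of the "wrap the tasklet in a struct with a flag" idea, using the classic tasklet API; all identifiers (pkt_tasklet, defer_to_tasklet, ...) are illustrative assumptions:

    #include <linux/interrupt.h>
    #include <linux/skbuff.h>
    #include <linux/atomic.h>
    #include <linux/threads.h>          /* NR_CPUS */

    struct pkt_tasklet {
        struct tasklet_struct tasklet;
        struct sk_buff *skb;
        atomic_t in_use;                /* 0 = free, 1 = scheduled or running */
    };

    static struct pkt_tasklet pool[NR_CPUS];

    static void pkt_tasklet_fn(unsigned long data)
    {
        struct pkt_tasklet *pt = (struct pkt_tasklet *)data;

        /* ... process pt->skb here ... */
        kfree_skb(pt->skb);
        atomic_set(&pt->in_use, 0);     /* slot is reusable again */
    }

    /* Called from the hook: hand the skb to a free slot, if any. */
    static bool defer_to_tasklet(struct sk_buff *skb)
    {
        int i;

        for (i = 0; i < NR_CPUS; i++) {
            if (atomic_cmpxchg(&pool[i].in_use, 0, 1) == 0) {
                pool[i].skb = skb;
                tasklet_init(&pool[i].tasklet, pkt_tasklet_fn,
                             (unsigned long)&pool[i]);
                tasklet_schedule(&pool[i].tasklet);
                return true;
            }
        }
        return false;                   /* pool exhausted; caller decides */
    }

Each slot marks itself free at the end of its tasklet function, so the hook can claim it again, which is exactly the extra machinery the question worries about.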
Workqueue solution. Same problem: who will delete the work item? The same "solution" as for tasklets?
Workqueue solution 2. Just create a permanent work item during module loading, save packets to some queue and process them inside the work function. Maybe two work items and two queues: incoming and outgoing. But I'm afraid that with this solution I will use only one (or two) processors, since it looks like a single work item can't run on several processors simultaneously.
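A minimal sketch of this option, assuming the hook steals the skb (e.g. returns NF_STOLEN) and hands it to one permanent work item through an skb queue; identifiers are illustrative:

    #include <linux/workqueue.h>
    #include <linux/skbuff.h>

    static struct sk_buff_head pkt_queue;   /* has its own internal spinlock */
    static struct work_struct pkt_work;

    static void pkt_work_fn(struct work_struct *work)
    {
        struct sk_buff *skb;

        while ((skb = skb_dequeue(&pkt_queue)) != NULL) {
            /* ... process the packet ... */
            kfree_skb(skb);
        }
    }

    /* Called from the hook after stealing the skb. */
    static void defer_to_work(struct sk_buff *skb)
    {
        skb_queue_tail(&pkt_queue, skb);
        schedule_work(&pkt_work);           /* no-op if already pending */
    }

    static void pkt_defer_init(void)
    {
        skb_queue_head_init(&pkt_queue);
        INIT_WORK(&pkt_work, pkt_work_fn);
    }

skb_queue_tail()/skb_dequeue() take the queue's internal lock, so the hook needs no extra locking, and schedule_work() simply does nothing if the work item is already pending.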
Any other solutions?
One can use high-priority (WQ_HIGHPRI), unbound (WQ_UNBOUND) workqueues and stick with option 3 listed in the question ("Workqueue solution 2"). WQ_HIGHPRI ensures that processing is initiated as soon as possible; WQ_UNBOUND eliminates the single-CPU bottleneck, as the scheduler can assign the work items to any available CPU.
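A minimal sketch of that suggestion, assuming the module allocates its own workqueue (the name pkt_wq is illustrative):

    #include <linux/workqueue.h>
    #include <linux/init.h>
    #include <linux/errno.h>

    static struct workqueue_struct *pkt_wq;

    static int __init pkt_wq_init(void)
    {
        /* max_active = 0 keeps the default concurrency limit */
        pkt_wq = alloc_workqueue("pkt_wq", WQ_HIGHPRI | WQ_UNBOUND, 0);
        if (!pkt_wq)
            return -ENOMEM;
        return 0;
    }

    static void __exit pkt_wq_exit(void)
    {
        destroy_workqueue(pkt_wq);          /* drains any pending work */
    }

    /* From the hook path, per packet or per batch:
     *     queue_work(pkt_wq, &some_work_item);
     */

With WQ_UNBOUND, each item queued via queue_work(pkt_wq, ...) can be picked up by a worker on any CPU, so queuing one work item per packet (or per batch) lets the processing spread across processors instead of being pinned to one.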