Linux: How to debug spin_lock contention at the source


tl;dr

There are various spin locks and mutex locks in the Kernel source file `net/packet/af_packet.c`. I can read the source to see which functions take locks and in what order they are called, but I can't tell from the code alone whether the locks ever spin, or for how long. I need to see what is actually happening at run time.

How can I trace spin lock activity for a single application?

Background

I have written a basic multi-threaded network traffic generator and I would like to find out what is causing it to block/stall. `perf top` shows a large share of cycles (circa 25%) in `_raw_spin_lock_irqsave` and `native_queued_spin_lock_slowpath`.
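One way to go beyond `perf top` and see which call paths actually reach those lock slow paths is to record call graphs and then walk the callers of the lock symbols. This is only a sketch (it needs root, and the recording duration is arbitrary):

```shell
# Record kernel+user call graphs system-wide for 10 seconds
# (or use -p <pid> to restrict to the traffic generator).
perf record -g -a -- sleep 10

# Browse the profile. In the TUI, search ('/') for
# native_queued_spin_lock_slowpath or _raw_spin_lock_irqsave and
# expand the entry: the caller chain above it names the contended
# lock site, e.g. a specific function in af_packet.c.
perf report -g --no-children
```

Newer perf versions also have `perf lock` for dedicated lock profiling, but whether it is usable depends on the kernel's tracepoint/config support.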

When I run the traffic generator with a single worker thread pinned to a single CPU core, the transmit rate is about 6.5Gbps and the worker thread's core runs at circa 35% utilisation. The core that processes the NET_RX soft IRQ (which, despite the name, also handles the clean-up of transmitted traffic) is running at 100%, so it seems that the Kernel can't process any more packets per second due to a bottleneck in the Tx path inside the Kernel (af_packet.c) "somewhere".
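A quick way to confirm which cores are absorbing the NET_RX work is to watch the per-CPU softirq counters (the column layout depends on the number of cores):

```shell
# Print the per-CPU NET_RX and NET_TX softirq counts; re-run this
# (or wrap it in `watch -n1`) and the columns whose counters climb
# fastest are the cores handling the softirq work.
grep -E 'NET_RX|NET_TX' /proc/softirqs
```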

When I run the traffic generator application with a second worker thread, each thread pinned to a separate core, each core runs at circa 35% utilisation. Two additional cores are now running at 100% for the NET_RX soft IRQs pinned to them (processing the transmit clean-up within the Kernel for the two worker threads). So the application is using double the CPU, and so are the NET_RX IRQs, yet the transmit rate has only increased from 6.5Gbps to 7Gbps: a 100% increase in CPU utilisation for only a ~10% increase in transmitted traffic. In `perf top` I now see up to 25% of cycles in `_raw_spin_lock_irqsave` and `native_queued_spin_lock_slowpath`, so it looks to me like a spin lock or mutex is contended in the Kernel (in net/packet/af_packet.c), and I'd like to narrow down exactly where (which lock, or at least which function call).

I have been trying to use SystemTap to find out. If one dares to probe `kernel.function("spin_lock")`, or `raw_spin_lock`/`_raw_spin_lock`/`__raw_spin_lock` etc., one is presented either with an insane amount of output that is difficult to sort, or with a full crash/system lock-up. That is to be expected, really, since so many processes take and poll spin locks.
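A sketch of how the flood might be tamed: restrict the probe to the target process with `-x <pid>` and aggregate counts in kernel space instead of printing per event. Note the assumptions here: which lock functions are probeable varies by kernel build (the plain `spin_lock` entry points are often inlined, leaving only the `_raw_spin_lock*` variants visible), and probing lock paths always carries some risk of making the contention worse.

```systemtap
# Count, per kernel call site, how often the traffic generator's pid
# enters the irqsave spin-lock path; print the top sites on Ctrl-C.
# Run as: stap -v lockhits.stp -x <pid-of-traffic-generator>
global hits

probe kernel.function("_raw_spin_lock_irqsave") {
    if (pid() == target())
        hits[caller()] <<< 1
}

probe end {
    foreach (c in hits- limit 20)
        printf("%-60s %d\n", c, @count(hits[c]))
}
```

Since the aggregation happens in the probe handler and only the summary is printed at exit, this avoids both the unsortable per-event output and most of the overhead of tracing every lock acquisition system-wide.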

There are 0 answers