SIGPROF kills my server when using google perftools

I have a multithreaded server process, written in C/C++, that I am trying to profile with Google perftools. However, when I run the process under perftools, my server soon stops with a "syscall interrupted" error, which I think is caused by the incoming SIGPROF. (The system call that is actually interrupted is deep inside my call to zmq_recv, but I don't think it really matters which one it is.)

Is this expected behavior? Should I be explicitly handling this case somehow? Or is something going wrong here?

There is 1 answer

TheCodeArtist On BEST ANSWER

According to the ZeroMQ documentation for zmq_recv(), the call fails with EINTR if a signal is delivered while it is in progress: it returns -1 and sets errno to EINTR.

Generating signals right in the middle of the zmq_recv() call would be a tough task for any test. Luckily gperftools generating a ton of SIGPROFs has uncovered this subtle "bug" in your code.

Your code must handle this gracefully, because ZeroMQ is deliberately ceding control back to you. The retry logic can be as simple as replacing the existing call:

    /* Block until a message is available to be received from socket */
    rc = zmq_recv (socket, &part, 0);

with a new one that adds retry logic, as follows:

    /* Block until a message is available to be received from socket.
     * Keep retrying if interrupted by any signal.
     * errno is only meaningful when the call failed, so check rc first.
     */
    do {
        rc = zmq_recv (socket, &part, 0);
    } while (rc == -1 && errno == EINTR);

Signals that you want to act on need a handler of your own. The interruption caused by SIGPROF, whose handler is installed by the profiler, can simply be ignored by retrying.
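To see the underlying behaviour without ZeroMQ or the profiler, here is a minimal sketch using plain POSIX calls. It rests on a few assumptions that are mine, not from the question: read() on a pipe stands in for zmq_recv(), and SIGALRM from setitimer(ITIMER_REAL) stands in for SIGPROF (a CPU-time timer such as ITIMER_PROF would not fire while the only thread is blocked). The names on_tick and count_interrupted_reads are made up for the illustration.

```c
#define _XOPEN_SOURCE 700   /* for sigaction() and setitimer() under strict -std= modes */
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <unistd.h>

static int wake_fd = -1;                 /* write end of the pipe */
static volatile sig_atomic_t ticks = 0;  /* timer signals handled so far */

/* Timer handler: on the third tick, make one byte available to the reader. */
static void on_tick(int signo)
{
    (void)signo;
    if (++ticks == 3) {
        int saved_errno = errno;              /* do not disturb the interrupted code */
        ssize_t n = write(wake_fd, "x", 1);   /* write() is async-signal-safe */
        (void)n;
        errno = saved_errno;
    }
}

/* Returns how many times read() failed with EINTR before it finally
 * succeeded, or -1 if something else went wrong. */
int count_interrupted_reads(void)
{
    int fds[2];
    struct sigaction sa, old_sa;
    struct itimerval every_10ms = { {0, 10000}, {0, 10000} };
    struct itimerval off = { {0, 0}, {0, 0} };
    int interrupted = 0;
    char byte;
    ssize_t rc;

    if (pipe(fds) == -1)
        return -1;
    wake_fd = fds[1];
    ticks = 0;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_tick;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;                     /* deliberately no SA_RESTART */
    if (sigaction(SIGALRM, &sa, &old_sa) == -1)
        return -1;
    if (setitimer(ITIMER_REAL, &every_10ms, NULL) == -1)
        return -1;

    /* The retry loop: a failed read() with errno == EINTR is not an error,
     * it only means a signal handler ran while we were blocked. */
    do {
        rc = read(fds[0], &byte, 1);
        if (rc == -1 && errno == EINTR)
            interrupted++;
    } while (rc == -1 && errno == EINTR);

    /* Stop the timer and restore the previous signal disposition. */
    setitimer(ITIMER_REAL, &off, NULL);
    sigaction(SIGALRM, &old_sa, NULL);
    close(fds[0]);
    close(fds[1]);

    return rc == 1 ? interrupted : -1;
}
```

Without the loop, the first timer tick would surface as a failed read, which is the same kind of failure the question reports as "syscall interrupted".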

Finally, you may want to handle specific signals and act accordingly, for example terminating your program gracefully when the user presses CTRL+C while it is waiting in zmq_recv().

    /* Block until a message is available to be received from socket.
     * If interrupted by any signal,
     * - in handler-code: Check for signal number and update status accordingly.
     * - in regular-code: Check for status and retry/exit as appropriate.
     */
    do {
        rc = zmq_recv (socket, &part, 0);
    } while (rc == -1 && errno == EINTR && status == RETRY);

In the interest of keeping code "clean", you will be better served by using the above snippet to write your own static inline function wrapper around zmq_recv() which you can call in your program.

Regarding the decision to make zmq_recv() return EINTR when a signal arrives, you might want to check out this article on the "worse is better" philosophy behind such a design, which favors simplicity of implementation.


UPDATE: Documented code to handle signals in the context of zmq_recv() is available at git.lucina.net/zeromq-examples.git/tree/zmq-camera.c. It is along the same lines as explained above, but it looks well tested and ready to use with detailed comments graciously sprinkled all over (yayyy zeromq!).