I have a parallel (MPI) C/C++ program that occasionally hits an error under certain conditions. When the error occurs, a message is printed and the program exits. I'd like to set a breakpoint there so I can inspect the stack and see in more detail what caused the error. I'm using TotalView for debugging, and I'd like it to stop at a breakpoint in my error routine, with that breakpoint always set up automatically. Is there a way to do this?
I'm looking into using signal.h and raise, but it's not clear yet how TotalView responds.
According to the question "How do you stop in TotalView after an MPI Error?", it appears that C++ exception handling, i.e. a throw, automatically causes TotalView to stop. What's the right way to do this in C?
In TotalView, the File > Signals menu option opens this window:
This window controls TotalView's default behavior in response to each raised signal. SIGTRAP and SIGSTOP are reserved, and TotalView appears to treat them differently: raise(SIGSTOP) did not stop as expected in TotalView.
This program:
produces this response:
And the program state is listed as "Exited or Never Created". When SIGTRAP is replaced with SIGSTOP, the same result occurs, but without the "Unexpected..." message.
As the image above shows, SIGINT, SIGTSTP, SIGTTIN and SIGTTOU cause TotalView to stop by default, as if there were a breakpoint.
Similar to the answer provided by Mooing Duck (Totalview: is there a way to hardcode a break point?), a raise() call with one of these signals can be made optional, so that it only fires when you are trying to debug:
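A minimal sketch of such a guarded call follows. It assumes a hypothetical environment variable, MY_DEBUG_BREAK, as the on/off switch and a hypothetical helper, error_break(), that the error routine would call before exiting. Neither name comes from TotalView or MPI; SIGINT is chosen only because it is one of the signals listed above as stopping TotalView by default.

```c
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical switch: export MY_DEBUG_BREAK=1 to enable the break. */
int debug_break_enabled(void)
{
    const char *flag = getenv("MY_DEBUG_BREAK");
    return flag != NULL && flag[0] == '1';
}

/* Hypothetical helper, meant to be called from the error routine.
 * Prints the message, then raises SIGINT only when the switch is on.
 * Returns 1 if the signal was raised successfully, 0 otherwise. */
int error_break(const char *msg)
{
    fprintf(stderr, "error: %s\n", msg);
    if (!debug_break_enabled())
        return 0;
    /* raise() returns 0 on success. */
    return raise(SIGINT) == 0;
}
```

In the error routine, this would be called as error_break("some message") just before the program exits. Outside a debugger, the default action for SIGINT terminates the process, so the switch should stay off in normal runs.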
This is probably just one of many ways to get the effect of a hard-coded breakpoint.