Same error message repeats multiple times when running mpiexec


I was trying the mpiexec command, and it failed with a SIGSEGV (segmentation fault). However, my question is not about why the error occurred, but about how the error is displayed.

Here is the error output:

[songyi719-thinkpad-x1-extreme-2nd:172415] *** Process received signal ***
[songyi719-thinkpad-x1-extreme-2nd:172415] Signal: Segmentation fault (11)
[songyi719-thinkpad-x1-extreme-2nd:172415] Signal code: Address not mapped (1)
[songyi719-thinkpad-x1-extreme-2nd:172415] Failing at address: 0x440000e8
[songyi719-thinkpad-x1-extreme-2nd:172412] *** Process received signal ***
[songyi719-thinkpad-x1-extreme-2nd:172412] Signal: Segmentation fault (11)
[songyi719-thinkpad-x1-extreme-2nd:172412] Signal code: Address not mapped (1)
[songyi719-thinkpad-x1-extreme-2nd:172412] Failing at address: 0x440000e8
[songyi719-thinkpad-x1-extreme-2nd:172413] *** Process received signal ***
[songyi719-thinkpad-x1-extreme-2nd:172413] Signal: Segmentation fault (11)
[songyi719-thinkpad-x1-extreme-2nd:172413] Signal code: Address not mapped (1)
[songyi719-thinkpad-x1-extreme-2nd:172413] Failing at address: 0x440000e8
[songyi719-thinkpad-x1-extreme-2nd:172415] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0x153c0)[0x7f0c3a59e3c0]
[songyi719-thinkpad-x1-extreme-2nd:172415] [ 1] /usr/local/lib/libmpi.so.40(MPI_Comm_rank+0x3b)[0x7f0c3a78771b]
[songyi719-thinkpad-x1-extreme-2nd:172415] [ 2] ./data(+0x3a432)[0x562c1fab5432]
[songyi719-thinkpad-x1-extreme-2nd:172415] [ 3] ./data(+0x98d9)[0x562c1fa848d9]
[songyi719-thinkpad-x1-extreme-2nd:172415] [ 4] [songyi719-thinkpad-x1-extreme-2nd:172413] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0x153c0)[0x7fe5dd1ec3c0]
[songyi719-thinkpad-x1-extreme-2nd:172413] [ 1] /usr/local/lib/libmpi.so.40(MPI_Comm_rank+0x3b)[0x7fe5dd3d571b]
[songyi719-thinkpad-x1-extreme-2nd:172413] [songyi719-thinkpad-x1-extreme-2nd:172412] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0x153c0)[0x7f021418a3c0]
[songyi719-thinkpad-x1-extreme-2nd:172412] [ 1] /usr/local/lib/libmpi.so.40(MPI_Comm_rank+0x3b)[0x7f021437371b]
[songyi719-thinkpad-x1-extreme-2nd:172412] [ 2] [songyi719-thinkpad-x1-extreme-2nd:172414] *** Process received signal ***
[songyi719-thinkpad-x1-extreme-2nd:172414] Signal: Segmentation fault (11)
[songyi719-thinkpad-x1-extreme-2nd:172414] Signal code: Address not mapped (1)
[songyi719-thinkpad-x1-extreme-2nd:172414] Failing at address: 0x440000e8
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0x7f0c3a3be0b3]
[songyi719-thinkpad-x1-extreme-2nd:172415] [ 5] ./data(+0xa33e)[0x562c1fa8533e]
[songyi719-thinkpad-x1-extreme-2nd:172415] *** End of error message ***
[songyi719-thinkpad-x1-extreme-2nd:172414] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0x153c0)[0x7fc68e9043c0]
[songyi719-thinkpad-x1-extreme-2nd:172414] [ 1] /usr/local/lib/libmpi.so.40(MPI_Comm_rank+0x3b)[0x7fc68eaed71b]
[songyi719-thinkpad-x1-extreme-2nd:172414] [ 2] ./data(+0x3a432)[0x55e7f5786432]
[songyi719-thinkpad-x1-extreme-2nd:172414] [ 3] ./data(+0x98d9)[0x55e7f57558d9]
[songyi719-thinkpad-x1-extreme-2nd:172414] [ 4] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0x7fc68e7240b3]
[songyi719-thinkpad-x1-extreme-2nd:172414] [ 5] ./data(+0xa33e)[0x55e7f575633e]
[songyi719-thinkpad-x1-extreme-2nd:172414] *** End of error message ***
[ 2] ./data(+0x3a432)[0x560705a04432]
[songyi719-thinkpad-x1-extreme-2nd:172413] [ 3] ./data(+0x98d9)[0x5607059d38d9]
[songyi719-thinkpad-x1-extreme-2nd:172413] [ 4] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0x7fe5dd00c0b3]
[songyi719-thinkpad-x1-extreme-2nd:172413] [ 5] ./data(+0xa33e)[0x5607059d433e]
[songyi719-thinkpad-x1-extreme-2nd:172413] *** End of error message ***
./data(+0x3a432)[0x559eacf7a432]
[songyi719-thinkpad-x1-extreme-2nd:172412] [ 3] ./data(+0x98d9)[0x559eacf498d9]
[songyi719-thinkpad-x1-extreme-2nd:172412] [ 4] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0x7f0213faa0b3]
[songyi719-thinkpad-x1-extreme-2nd:172412] [ 5] ./data(+0xa33e)[0x559eacf4a33e]
[songyi719-thinkpad-x1-extreme-2nd:172412] *** End of error message ***
--------------------------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpiexec noticed that process rank 3 with PID 0 on node songyi719-thinkpad-x1-extreme-2nd exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------

As you can see, the same error message is interleaved and repeated 4 times. I uninstalled and reinstalled Open MPI, but the error still repeats 4 times.

How can this happen? How can I reduce this output to a single, non-repeated error message?
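For reference, the four reports in the log carry four different PIDs (172412 to 172415), which suggests that each MPI process prints its own report and the lines get mixed on the way to the terminal. As a rough sketch of untangling this after the fact, the Python snippet below groups a captured log by the `[hostname:pid]` prefix. The function name `group_by_pid` and the regular expression are my own assumptions based only on the prefix format visible above, not anything provided by Open MPI, and fragments that lost their prefix in the interleaving cannot be attributed reliably.

```python
import re
from collections import defaultdict

# Assumed prefix format, taken from the log above: "[hostname:pid] ".
# Frame markers such as "[ 0]" or "[0x7f...]" contain no ":<digits>]" and are not matched.
PREFIX = re.compile(r"\[([^\[\]\s:]+):(\d+)\] ?")

def group_by_pid(log_text):
    """Return {pid: [message, ...]}; fragments without a prefix go under None."""
    groups = defaultdict(list)
    for line in log_text.splitlines():
        matches = list(PREFIX.finditer(line))
        if not matches:
            # The whole line lost its prefix during interleaving.
            if line.strip():
                groups[None].append(line.strip())
            continue
        # Any text before the first prefix cannot be attributed either.
        head = line[:matches[0].start()].strip()
        if head:
            groups[None].append(head)
        # One physical line may hold fragments from several processes.
        for i, m in enumerate(matches):
            end = matches[i + 1].start() if i + 1 < len(matches) else len(line)
            message = line[m.end():end].strip()
            if message:
                groups[int(m.group(2))].append(message)
    return dict(groups)
```

Calling `group_by_pid(open("mpi_err.log").read())` on a saved copy of the output (the file name is hypothetical) would give one list of messages per PID, which makes it easier to check whether all processes report the same failure.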
