C++ debugging: Terminated with SIGABRT


I am writing a C++ program that runs on a cluster of machines, all of which talk to each other over TCP sockets. The program crashes randomly on one of the machines. I analyzed the core dump with gdb; the output follows:

$ gdb executable dump

  Core was generated by `/home/user/experiments/files/executable 2 /home/user/'.
  Program terminated with signal SIGABRT, Aborted.
  #0 0x00007fb76a084c37 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
  56    ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.

  (gdb) backtrace
  #0 0x00007fb76a084c37 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
  #1 0x00007fb76a088028 in __GI_abort () at abort.c:89
  #2 0x00007fb76a0c12a4 in __libc_message (do_abort=do_abort@entry=2, fmt=fmt@entry=0x7fb76a1cd113 "*** %s ***: %s terminated\n") at ../sysdeps/posix/libc_fatal.c:175
  #3 0x00007fb76a158bbc in __GI___fortify_fail (msg=<optimized out>, msg@entry=0x7fb76a1cd0aa "buffer overflow detected") at fortify_fail.c:38
  #4 0x00007fb76a157a90 in __GI___chk_fail () at chk_fail.c:28
  #5 0x00007fb76a158b07 in __fdelt_chk (d=<optimized out>) at fdelt_chk.c:25
  #6 0x000000000040a918 in LocalSenderPort::run() ()
  #7 0x000000000040ae70 in LocalSenderPort::LocalSenderPort(unsigned int, std::string, std::vector<std::string, std::allocator<std::string> >, char*) ()
  #8 0x00000000004033d5 in main ()

Any suggestions on what I should look for? How should I proceed? Any help is really appreciated.

I am not sharing the code right now, as it's a large codebase spread across many files, but I can share it if needed.


There is 1 answer

Employed Russian

This error: __fdelt_chk (d=<optimized out>) at fdelt_chk.c:25 means that your program violated a precondition of one of the FD_* macros.

The source of fdelt_chk is quite simple, and there are only two conditions under which it fails: you pass in a negative file descriptor, or you pass in a file descriptor greater than 1023 (that is, greater than or equal to FD_SETSIZE, which is 1024 on Linux).
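To find the offending call, it may help to guard every FD_SET with the same range check that the fortified macro performs, so the program can report the bad descriptor instead of aborting. The following is only a sketch: fd_fits_in_fd_set and safe_fd_set are hypothetical helper names, not part of the program in the question or of any library.

```cpp
#include <sys/select.h>

// Hypothetical helper (assumption, not from the original program):
// true only if fd can be used with FD_SET/FD_CLR/FD_ISSET without
// tripping the fortify check, i.e. 0 <= fd < FD_SETSIZE.
inline bool fd_fits_in_fd_set(int fd) {
    return fd >= 0 && fd < FD_SETSIZE;
}

// Hypothetical wrapper: adds fd to the set only when it is in range.
// Returns false for a negative fd (for example, a failed socket() or
// accept() whose -1 result was never checked) or for an fd that is
// too large for select().
inline bool safe_fd_set(int fd, fd_set* set) {
    if (!fd_fits_in_fd_set(fd)) {
        return false;  // caller should log fd and handle the error
    }
    FD_SET(fd, set);
    return true;
}
```

Replacing the FD_SET calls in LocalSenderPort::run() with a wrapper like this and logging the rejected value should show whether the descriptor was negative or simply too large.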

In this day and age, using select and/or FD_SET in any program that can have more than 1024 simultaneous connections (which Linux easily allows) can only end in tears. Use epoll instead.
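As a rough illustration of the epoll route, here is a minimal, Linux-only sketch that waits for any of a set of descriptors to become readable. The function name wait_readable is hypothetical, and the sketch creates and destroys the epoll instance on every call purely to stay self-contained; a real server would more likely create it once and keep descriptors registered across calls.

```cpp
#include <sys/epoll.h>
#include <unistd.h>
#include <vector>

// Hypothetical helper (assumption, not from the original program):
// waits up to timeout_ms for any descriptor in fds to become readable
// and returns the ready ones. Returns an empty vector on timeout or
// error. Unlike select(), this has no FD_SETSIZE limit on fd values.
std::vector<int> wait_readable(const std::vector<int>& fds, int timeout_ms) {
    std::vector<int> ready;
    if (fds.empty()) return ready;  // epoll_wait needs maxevents > 0

    int ep = epoll_create1(0);
    if (ep < 0) return ready;

    for (int fd : fds) {
        epoll_event ev{};
        ev.events = EPOLLIN;  // interested in "readable"
        ev.data.fd = fd;      // remember which fd this event belongs to
        if (epoll_ctl(ep, EPOLL_CTL_ADD, fd, &ev) < 0) {
            close(ep);
            return ready;
        }
    }

    std::vector<epoll_event> events(fds.size());
    int n = epoll_wait(ep, events.data(),
                       static_cast<int>(events.size()), timeout_ms);
    for (int i = 0; i < n; ++i) {  // n is -1 on error, so the loop is skipped
        ready.push_back(events[i].data.fd);
    }
    close(ep);
    return ready;
}
```

Because the descriptor travels in ev.data.fd rather than as a bit index into a fixed-size set, a value of 1024 or higher is handled like any other.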