generate core file with gdb

I used gdb's generate-core-file command to generate a core file for a process (mongod), but the process mmaps many data files and its RES (resident memory) is up to 36.1G.
After the core file had consumed 34G, the disk ran out of space, so I got:

warning: writing note section (No space left on device)
Saved corefile core.12038

I want to know whether all the mmapped data will be dumped to the core file. What can I do if I only want to see some local variables?

Background: we had an issue in production, and the production binary has no symbol information, so I want to generate a core file and analyze it offline.

There are 2 answers

Employed Russian (BEST ANSWER)

I want to know whether all the mmapped data will be dumped to the core file.

Usually the kernel dumps only writable mmaps, not read-only ones. However, this is configurable: see the core(5) man page (the "Controlling which mappings are written to the core dump" section).
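As a rough illustration of that man page section: the setting lives in /proc/<pid>/coredump_filter as a hexadecimal bit mask. The sketch below decodes such a mask into the mapping types listed in core(5). The helper name describe_coredump_filter is made up for this example, and whether gdb's generate-core-file honors the filter depends on your gdb version, so treat that as something to verify.

```cpp
#include <string>
#include <vector>

// Bit meanings as listed in core(5) for /proc/<pid>/coredump_filter.
static const char* const kCoredumpFilterBits[] = {
    "anonymous private mappings",    // bit 0
    "anonymous shared mappings",     // bit 1
    "file-backed private mappings",  // bit 2
    "file-backed shared mappings",   // bit 3
    "ELF headers",                   // bit 4
    "private huge pages",            // bit 5
    "shared huge pages",             // bit 6
    "private DAX pages",             // bit 7
    "shared DAX pages",              // bit 8
};

// Hypothetical helper (not part of any library): returns a description
// for every bit that is set in the given coredump_filter mask.
std::vector<std::string> describe_coredump_filter(unsigned long mask) {
    std::vector<std::string> enabled;
    const unsigned count =
        sizeof(kCoredumpFilterBits) / sizeof(kCoredumpFilterBits[0]);
    for (unsigned bit = 0; bit < count; ++bit) {
        if (mask & (1UL << bit)) {
            enabled.push_back(kCoredumpFilterBits[bit]);
        }
    }
    return enabled;
}
```

For example, describe_coredump_filter(0x33), the default documented in core(5), reports anonymous private and shared mappings, ELF headers, and private huge pages. Bits 2 and 3, which cover file-backed mappings such as mmapped data files, are not set in that mask.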

Background: we had an issue in production, and the production binary has no symbol information.

The "standard" approach is to debug such binaries remotely: run gdbserver on the production machine and connect to it from a gdb that has access to the full-debug binary.

AudioBubble

What can I do if I only want to see some local variables? Background: we had an issue in production, and the production binary has no symbol information.

You have not mentioned the OS in your question, so I will assume Linux. You have two options:
1) Install a build of your program with debugging information on the production server.
2) If you cannot do that, analyze the assembler code of the function you are interested in and read the values of the local variables from registers.

In either case, use SystemTap to trace your program.

Let me illustrate both approaches with a simple example. First, the C++ program to analyze:

>cat main.cpp
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int f(int arg)
{
    int a = arg+1;
    int b = arg+2;
    int c = a + b;
    printf ("printf in program: f, c: %d\n", c);
    return c;
}

int main(int argc, char *argv[])
{
   printf ("f: %p\n", &f);
   int sum = 0;
   while (true) {
     for (int i= atoi(argv[1]); i < atoi(argv[2]); ++i) {
       sum += f(i);
     }
     sleep(5);
   }

   printf("Sum: %d\n", sum);
   return 0;
}

I want to get the value of the local variable "c" in the function f().

1) If symbol information is available

>cat measure_f.stp
probe process("a.out").statement("*@main.cpp:10")
{
  printf("SystemTap, time: %s, the local variable c :%d\n", ctime(gettimeofday_s()), $c)
}

>sudo stap measure_f.stp -c "./a.out 21 23"
f: 0x400634
printf in program: f, c: 45
printf in program: f, c: 47
SystemTap, time: Fri Dec 27 12:59:31 2013, the local variable c :45
SystemTap, time: Fri Dec 27 12:59:31 2013, the local variable c :47
printf in program: f, c: 45
printf in program: f, c: 47
SystemTap, time: Fri Dec 27 12:59:36 2013, the local variable c :45
SystemTap, time: Fri Dec 27 12:59:36 2013, the local variable c :47

2) If symbol information is not available, use the assembler code

First, disassemble your function and find the address you want to monitor:

(gdb) disassemble /m f
Dump of assembler code for function f(int):
6       {
   0x0000000000400634 <+0>:     push   %rbp
   0x0000000000400635 <+1>:     mov    %rsp,%rbp
   0x0000000000400638 <+4>:     sub    $0x20,%rsp
   0x000000000040063c <+8>:     mov    %edi,-0x14(%rbp)

7           int a = arg+1;
   0x000000000040063f <+11>:    mov    -0x14(%rbp),%eax
   0x0000000000400642 <+14>:    add    $0x1,%eax
   0x0000000000400645 <+17>:    mov    %eax,-0xc(%rbp)

8           int b = arg+2;
   0x0000000000400648 <+20>:    mov    -0x14(%rbp),%eax
   0x000000000040064b <+23>:    add    $0x2,%eax
   0x000000000040064e <+26>:    mov    %eax,-0x8(%rbp)

9           int c = a + b;
   0x0000000000400651 <+29>:    mov    -0x8(%rbp),%eax
   0x0000000000400654 <+32>:    mov    -0xc(%rbp),%edx
   0x0000000000400657 <+35>:    lea    (%rdx,%rax,1),%eax
   0x000000000040065a <+38>:    mov    %eax,-0x4(%rbp)

10          printf ("printf in program: f, c: %d\n", c);
   0x000000000040065d <+41>:    mov    -0x4(%rbp),%eax
   0x0000000000400660 <+44>:    mov    %eax,%esi
   0x0000000000400662 <+46>:    mov    $0x4007f8,%edi
   0x0000000000400667 <+51>:    mov    $0x0,%eax
   0x000000000040066c <+56>:    callq  0x4004f8 <printf@plt>

11          return c;
   0x0000000000400671 <+61>:    mov    -0x4(%rbp),%eax

12      }
   0x0000000000400674 <+64>:    leaveq
   0x0000000000400675 <+65>:    retq

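As you can see, the value of the local variable c is in register %eax when execution reaches 0x000000000040065a: the lea at 0x0000000000400657 has just computed a + b, and the mov at 0x000000000040065a is about to store the result into c's stack slot at -0x4(%rbp). So probe that address and read the register: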

> cat measure_f_2.stp
probe begin
{
  printf("Monitoring process %d\n", $1)
}

probe process($1).statement(0x000000000040065a).absolute
{ 
  printf("SystemTap (2), time: %s, the local variable c (rax):%d\n", ctime(gettimeofday_s()), register("rax"))
}

I started "./a.out 21 23" and then ran my SystemTap script against its PID:

>sudo stap measure_f_2.stp 11564
Monitoring process 11564
SystemTap (2), time: Fri Dec 27 13:15:09 2013, the local variable c (rax):45
SystemTap (2), time: Fri Dec 27 13:15:09 2013, the local variable c (rax):47
SystemTap (2), time: Fri Dec 27 13:15:14 2013, the local variable c (rax):45
SystemTap (2), time: Fri Dec 27 13:15:14 2013, the local variable c (rax):47
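One caveat when reading a 32-bit int out of a 64-bit register: on x86-64, an instruction that writes %eax (such as the lea above) zeroes the upper 32 bits of %rax, so a negative c would appear as a large positive number if the full 64-bit value is printed. The sketch below shows the conversion you would apply to such a value offline. The helper name low32_as_int is made up for this example and is not part of SystemTap or gdb.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: reinterpret the low 32 bits of a 64-bit register
// value as a signed 32-bit int, ignoring whatever is in the upper half.
std::int32_t low32_as_int(std::uint64_t reg) {
    // Keep only the low 32 bits (the %eax part of %rax).
    std::uint32_t low = static_cast<std::uint32_t>(reg);
    // Copy the bit pattern into a signed int without relying on
    // implementation-defined integer conversions.
    std::int32_t value;
    std::memcpy(&value, &low, sizeof value);
    return value;
}
```

With this, a register value of 45 stays 45, while 0x00000000FFFFFFD3 (what a zero-extended -45 looks like) is read back as -45.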