Determining symbol addresses using binutils/readelf

7.2k views Asked by At

I am working on a project where our verification test scripts need to locate symbol addresses within the build of software being tested. This might be used for setting breakpoints or reading static data from memory. What I am after is to create a map file containing symbol names, base address in memory, and size. Our build outputs an ELF file which has the information I want. I've been trying to use the readelf, nm, and objdump tools to try and to gain the symbol addresses I need.

I originally tried readelf -s file.elf and that seemed to access some symbols, particularly those which were written in assembler. However, many of the symbols that I wanted were not in there - specifically those that originated within our Ada code.

I used readelf --debug-dump file.elf to dump all debug information. From that I do see all symbols, including those that were in the Ada code. However, the format seems to be in the DWARF format. Does anyone know why these symbols would not be output by readelf when I ask it to list the symbolic information? Perhaps there is simply an option I am missing.

Now I could go to the trouble of writing a custom DWARF parser to get the information but if I can get it using one of the Binutils (nm, readelf, objdump) then I'd really like prefer a standard solution.

1

There are 1 answers

1
jclin On

DWARF is the debug information and tries to reflect the relation of the original source code. Taking following code as an example

static int one() {
  // something
  return 1;
}
int main(int ac, char **av) {
  return one();
}

After you compile it using gcc -O3 -g, the static function one will be inlined into main. So when you use readelf -s, you will never see the symbol one. However, when you use readelf --debug-dump, you can see one is a function which is inlined.

So, in this example, compiler does not prohibit you use optimization with -g, so you can still debug the executable. In that example, even the function is optimized and inlined, gdb still can use DWARF information to know the function and source/line from current code block inside inlined function.

Above is just a case of compiler optimization. There might be plenty of reasons that could lead to mismatch symbols address between readelf -s and DWARF.