where/how does Apples GCC store DWARF inside an executable

10.4k views Asked by At

Where/how does Apples GCC store DWARF inside an executable?

I compiled a binary via gcc -gdwarf-2 (Apples GCC). However, neither objdump -g nor objdump -h does show me any debug information.

Also libbfd does not find any debug information. (I asked on the binutils-mailinglist about it here.)

I am able however to extract the debugging information via dsymutil (into a dSYM). libbfd is also able to read those debug info then.

4

There are 4 answers

6
Jason Molenda On BEST ANSWER

On Mac OS X there was a decision to have the linker ld not process all of the debug information when you link your program. The debug information is often 10xthe size of the program executable so having the linker process all of the debug info and include it in the executable binary was a serious detriment to link times. For iterative development - compile, link, compile, link, debug, compile link - this was a real hit.

Instead:

  • The compiler generates the DWARF debug information in the .s files, the assembler outputs it in the .o files
  • The linker includes a "debug map" in the executable binary which tells debug info users where all of the symbols were relocated during the link.

A consumer (doing .o-file debugging) loads the debug map from the executable and processes all of the DWARF in the .o files as-needed, remapping the symbols as per the debug map's instructions.

dsymutil can be thought of as a debug info linker. It does this same process -- read the debug map, load the DWARF from the .o files, relocate all the addresses -- and then outputs a single binary of all the DWARF at their final, linked addresses. This is the dSYM bundle.

Once you have a dSYM bundle, you've got plain old standard DWARF that any dwarf reading tool (which can deal with Mach-O binary files) can process.

There is an additional refinement that makes all of this work, the UUIDs included in Mach-O binaries. Every time the linker creates a binary, it emits a 128-bit UUID in the LC_UUID load command (v. otool -hlv or dwarfdump --uuid). This uniquely identifies that binary file. When dsymutil creates the dSYM, it includes that UUID. The debuggers will only associate a dSYM and an executable if they have matching UUIDs -- no dodgy file mod timestamps or anything like that.

We can also use the UUIDs to locate the dSYMs for binaries. They show up in crash reports, we include a Spotlight importer that you can use to search for them, e.g. mdfind "com_apple_xcode_dsym_uuids == E21A4165-29D5-35DC-D08D-368476F85EE1" if the dSYM is located in a Spotlight indexed location. You can even have a repository of dSYMs for your company and a program that can retrieve the correct dSYM given a UUID - maybe a little mysql database or something like that - so you run the debugger on a random executable and you instantly have all the debug info for that executable. There are some pretty neat things you can do with the UUIDs.

But anyway, to answer your original question: The unstripped binary has the debug map, the .o files have the DWARF, and when dsymutil is ran these are combined to create the dSYM bundle.

If you want to see the debug map entries, do nm -pa executable and they're all there. They are in the form of the old stabs nlist records - the linker already knew how to process stabs so it was easiest to use them - but you'll see how it works without much trouble, maybe refer to some stabs documentation if you're uncertain.

0
Timmmm On

It seems there are two ways for OSX to place debug information:

  1. In the .o object files used for compilation. The binary stores a reference to these files (by absolute path).

  2. In a separate bundle (directory) called .dSYM

If I compile with Apple's Clang using g++ -g main.cpp -o foo I get the bundle called foo.dSYM. However if I use CMake I get the debug info in the object files. I guess because it does a separate gcc -c main.cpp -o main.o step?

Anyway I found this command very useful for case 1:

$ dsymutil -dump-debug-map main
---
triple:          'x86_64-apple-darwin'
binary-path:     main
objects:         
  - filename:        /Users/tim/foo/build/CMakeFiles/main.dir/main.cpp.o
    timestamp:       1485951213
    symbols:         
      - { sym: __ZNSt3__111char_traitsIcE11eq_int_typeEii, objAddr: 0x0000000000000D50, binAddr: 0x0000000100001C90, size: 0x00000020 }
      - { sym: __ZNSt3__111char_traitsIcE6lengthEPKc, objAddr: 0x0000000000000660, binAddr: 0x00000001000015A0, size: 0x00000020 }
      - { sym: GCC_except_table3, objAddr: 0x0000000000000DBC, binAddr: 0x0000000100001E2C, size: 0x00000000 }
      - { sym: _main, objAddr: 0x0000000000000000, binAddr: 0x0000000100000F40, size: 0x00000090 }
      - { sym: __ZNSt3__124__put_character_sequenceIcNS_11char_traitsIcEEEERNS_13basic_ostreamIT_T0_EES7_PKS4_m, objAddr: 0x00000000000001F0, binAddr: 0x0000000100001130, size: 0x00000470 }
      - { sym: ___clang_call_terminate, objAddr: 0x0000000000000D40, binAddr: 0x0000000100001C80, size: 0x00000010 }
      - { sym: GCC_except_table5, objAddr: 0x0000000000000E6C, binAddr: 0x0000000100001EDC, size: 0x00000000 }
      - { sym: __ZNSt3__116__pad_and_outputIcNS_11char_traitsIcEEEENS_19ostreambuf_iteratorIT_T0_EES6_PKS4_S8_S8_RNS_8ios_baseES4_, objAddr: 0x0000000000000680, binAddr: 0x00000001000015C0, size: 0x000006C0 }
      - { sym: __ZNSt3__14endlIcNS_11char_traitsIcEEEERNS_13basic_ostreamIT_T0_EES7_, objAddr: 0x00000000000000E0, binAddr: 0x0000000100001020, size: 0x00000110 }
      - { sym: GCC_except_table2, objAddr: 0x0000000000000D7C, binAddr: 0x0000000100001DEC, size: 0x00000000 }
      - { sym: __ZNSt3__1lsINS_11char_traitsIcEEEERNS_13basic_ostreamIcT_EES6_PKc, objAddr: 0x0000000000000090, binAddr: 0x0000000100000FD0, size: 0x00000050 }
      - { sym: __ZNSt3__111char_traitsIcE3eofEv, objAddr: 0x0000000000000D70, binAddr: 0x0000000100001CB0, size: 0x0000000B }
...
0
Albert On

It seems it actually doesn't.

I traced dsymutil and it reads all the *.o files. objdump -h also lists all the debug info in them.

So it seems that those info isn't copied over to the binary.


Some related comments about this can also be found here.

3
Bogatyr On

Apple stores debugging information in separate files named *.dSYM. You can run dwarfdump on those files and see the DWARF Debug Info Entries.