I'm attempting to build a flat 32-bit PIC binary with the following C++ code:
extern "C" {
void print(const char *){}
void entry_func() {
print("abcd\n");
}
}
The assembly produced for the print("abcd\n") bit is:
calll .L1$pb
.L1$pb:
popl %ebx
.Ltmp3:
addl $_GLOBAL_OFFSET_TABLE_+(.Ltmp3-.L1$pb), %ebx
leal .L.str@GOTOFF(%ebx), %eax
movl %eax, (%esp)
calll print@PLT
If I use GNU ld to link a flat binary using this linker script:
SECTIONS {
. = 16M;
.text : ALIGN(4K) {
*(.text)
}
}
I get the following link error:
undefined reference to `_GLOBAL_OFFSET_TABLE_'
First Issue
Given the assembly I showed earlier, should I still expect the linker to produce a GOT even for a flat binary?
In the corresponding object file, I see these two relocations:
Offset Info Type Sym. Value Symbol's Name
0000001d 00000a0a R_386_GOTPC 00000000 _GLOBAL_OFFSET_TABLE_
00000023 00000309 R_386_GOTOFF 00000000 .L.str
Now according to this documentation I found, I would think the linker should emit a GOT:
R_386_GOTOFF
Computes the difference between a symbol's value and the address of the global offset table. It also instructs the link-editor to create the global offset table.
R_386_GOTPC
Resembles R_386_PC32, except that it uses the address of the global offset table in its calculation. The symbol referenced in this relocation normally is _GLOBAL_OFFSET_TABLE_, which also instructs the link-editor to create the global offset table.
Is this an issue with ld, or is ld not expected to emit a GOT because I'm producing a flat binary rather than an ELF binary?
Second Issue
Now I can patch this error by also compiling and linking in a .S file that actually defines this symbol:
.globl _GLOBAL_OFFSET_TABLE_
.section .got,"wa",@progbits
_GLOBAL_OFFSET_TABLE_:
.word 0xabcd // Filler data so it's easier to find in the objdump
This links successfully, but my binary seems to be incorrect when I objdump it:
00000000 <.data>:
0: 55 push %ebp
1: 89 e5 mov %esp,%ebp
3: 8b 45 08 mov 0x8(%ebp),%eax
6: 5d pop %ebp
7: c3 ret
8: 90 nop
...
f: 90 nop
10: 55 push %ebp
11: 89 e5 mov %esp,%ebp
13: 53 push %ebx
14: 50 push %eax
15: e8 00 00 00 00 call 0x1a
1a: 5b pop %ebx
1b: 81 c3 1b 00 00 00 add $0x1b,%ebx
21: 8d 83 38 00 00 01 lea 0x1000038(%ebx),%eax
27: 89 04 24 mov %eax,(%esp)
2a: e8 d1 ff ff ff call 0x0
2f: 83 c4 04 add $0x4,%esp
32: 5b pop %ebx
33: 5d pop %ebp
34: c3 ret
35: cd ab int $0xab
37: 00 61 62 add %ah,0x62(%ecx)
3a: 63 64 0a 00 arpl %sp,0x0(%edx,%ecx,1)
The value for $_GLOBAL_OFFSET_TABLE_+(.Ltmp3-.L1$pb) seems to have expanded correctly: _GLOBAL_OFFSET_TABLE_ has the relocation R_386_GOTPC and is calculated as the offset between the GOT (0x35) and the current PC (0x1b), and (.Ltmp3-.L1$pb) is just 1 byte (so 0x35-0x1b+0x1 = 0x1b).
My second issue is that the value for .L.str@GOTOFF seems to assume the GOT is at address zero. Its corresponding relocation is R_386_GOTOFF, which is calculated as the offset between the symbol (.L.str) and the GOT. My binary starts at 16MB (from the linker script) and .L.str sits at offset 0x38 into the binary, so the symbol's address is 0x1000038. Since the emitted displacement is also 0x1000038, the linker must have treated the GOT as being at address zero.
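To make the arithmetic concrete, here is a minimal C++ sketch that plugs the addresses from the objdump into the relocation formulas from the i386 psABI (GOTPC = GOT + A - P, GOTOFF = S + A - GOT). The function and constant names are just mine for illustration. The addends are assumptions: I haven't dumped them, so the GOTPC addend of 3 (1 for .Ltmp3-.L1$pb plus 2 for the opcode bytes before the immediate) and the GOTOFF addend of 0 are my best guess.

```cpp
#include <cstdint>

// Relocation formulas from the i386 psABI:
//   S = symbol value, A = addend, P = address of the patched field,
//   GOT = address of the global offset table.
constexpr std::uint32_t r_386_gotpc(std::uint32_t got, std::uint32_t a, std::uint32_t p) {
    return got + a - p;
}
constexpr std::uint32_t r_386_gotoff(std::uint32_t s, std::uint32_t a, std::uint32_t got) {
    return s + a - got;
}
// Solve GOTOFF for GOT, given the value the linker actually wrote.
constexpr std::uint32_t implied_got(std::uint32_t s, std::uint32_t a, std::uint32_t value) {
    return s + a - value;
}

constexpr std::uint32_t kBase = 0x1000000;    // ". = 16M" in the linker script
constexpr std::uint32_t kGot  = kBase + 0x35; // where my filler bytes (cd ab) landed
constexpr std::uint32_t kStr  = kBase + 0x38; // start of "abcd\n"

// GOTPC: the patched field is the immediate of the add, at offset 0x1d.
// Addend 3 is ASSUMED (see above).
static_assert(r_386_gotpc(kGot, 3, kBase + 0x1d) == 0x1b,
              "matches the immediate in add $0x1b,%ebx");

// GOTOFF: what I would expect if the GOT really were at 0x35 (addend 0 ASSUMED).
static_assert(r_386_gotoff(kStr, 0, kGot) == 0x3,
              "expected displacement relative to a GOT at 0x1000035");

// GOTOFF: what the observed displacement 0x1000038 implies instead.
static_assert(implied_got(kStr, 0, 0x1000038) == 0x0,
              "the linker appears to have used GOT = 0");
```

If these assumptions hold, the GOTPC value is consistent with a GOT at 0x1000035, while the GOTOFF value is only consistent with a GOT at 0, so the two relocations seem to disagree about where the GOT is.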
My second question: is there a way to manually tell the linker where the GOT is? I'm guessing my _GLOBAL_OFFSET_TABLE_ trick didn't work here because _GLOBAL_OFFSET_TABLE_ probably acts more as a symbol that's emitted to indicate where the GOT actually is rather than the other way around (the linker looking up wherever _GLOBAL_OFFSET_TABLE_ is and placing the GOT there).
My overall goal is to see if I can write a flat PIC binary in pure C/C++ (to a certain extent). I know at least for this small code example that I could circumvent the GOT in pure assembly with something like:
call .L$pb
.L$pb:
pop %ebx
addl $(.L.str - .L$pb), %ebx
movl %ebx, (%esp)
calll print@PLT
Here, rather than adding the offset between the PC and the GOT and then the offset between the GOT and .L.str, I just take the offset between the PC and .L.str directly. This emits an R_386_PC32 for $(.L.str - .L$pb), which can be resolved statically. The result is still PIC, but without the GOT. In the same way that the linker can relax PLT relocations to relative calls when the call and the function definition are in the same binary, I wonder if there's a way to relax these two GOT relocations into a plain relative reference to my binary-local data.