I'm trying to write an assembly program in GAS syntax that can access its variables from the .data section in a position-independent way on x86-64, while enforcing the 32-bit architecture and instruction set (%eip instead of %rip).
No matter which registers I tried, the best result I got was a Segmentation fault: 11, and even that came from accessing EIP, which I shouldn't be able to do at all, hence the segfault. I call it the best result because it at least told me something other than "meh, it won't do".
I'm compiling the file with gcc on macOS 10.13.6 on a mid-2010 Intel Core 2 Duo (which is probably why it is actually clang):
$ gcc --version
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 9.1.0 (clang-902.0.39.2)
Target: x86_64-apple-darwin17.7.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
and passing some options to the linker with this:
gcc -m32 -Wl,-fatal_warnings,-arch_errors_fatal,-warn_commons,-pie test.s
ld: warning: PIE disabled. Absolute addressing (perhaps -mdynamic-no-pic) not allowed in code signed PIE, but used in _main from /whatever.../test-a07cf9.o. To fix this warning, don't compile with -mdynamic-no-pic or link with -Wl,-no_pie
ld: fatal warning(s) induced error (-fatal_warnings)
clang: error: linker command failed with exit code 1 (use -v to see invocation)
1
test.s
.text
.global _main
_main:
xor %eax, %eax
xor %ebx, %ebx
# lea var1(%esi/edi/ebp/esp), %ebx # can't compile, not PIE
# lea var1(%eip), %ebx # segfault, obvs
# lea (%esp), %ebx # EBX = 17
# lea (%non-esp), %ebx # segfault
# lea 0(%esi), %ebx # segfault
# lea 0(%edi), %ebx # segfault
# lea 0(%ebp), %ebx # EBX = 0
# lea 0(%esp), %ebx # EBX = 17
# lea 0(%eip), %ebx # segfault, obvs
movl (%ebx), %eax
ret
.data
var1: .long 6
.end
I'm running it with ./a.out; echo $? to check the EAX value from ret at the end.
I looked at various sources, but mostly they use Intel syntax or are one of these questions - 1, 2, 3. I tried compiling the simplest C example I could come up with, i.e. a global variable + return from main(), to assembly with gcc -S test.c -fPIE -pie -fpie -m32:
int var1 = 6;
int main() { return var1; }
which basically resulted in:
.section __TEXT,__text,regular,pure_instructions
.macosx_version_min 10, 13
.globl _main ## -- Begin function main
.p2align 4, 0x90
_main: ## @main
.cfi_startproc
## BB#0:
pushl %ebp
Lcfi0:
.cfi_def_cfa_offset 8
Lcfi1:
.cfi_offset %ebp, -8
movl %esp, %ebp
Lcfi2:
.cfi_def_cfa_register %ebp
pushl %eax
calll L0$pb
L0$pb:
popl %eax
movl $0, -4(%ebp)
movl _var1-L0$pb(%eax), %eax
addl $4, %esp
popl %ebp
retl
.cfi_endproc
## -- End function
.section __DATA,__data
.globl _var1 ## @var1
.p2align 2
_var1:
.long 6 ## 0x6
.subsections_via_symbols
This obviously uses MOV as LEA and almost the same instruction as mine, except for the -L0$pb part, which should be roughly the address of _var1 minus the address of L0$pb, to get into the .data section.
And yet when I try the same approach with the var1 and _main labels, nothing:
.text
.global _main
_main:
xor %eax, %eax
xor %ebx, %ebx
#movl var1-_main(%ebp), %eax # EAX = 191
#movl var1-_main(%esp), %eax # EAX = 204
#movl var1-_main(%eax), %eax # segfault
ret
.data
var1: .long 6
.end
Any ideas what I'm doing wrong?
Edit:
I managed to cut out any unnecessary stuff from the disassembled C example and ended up with this:
.text
.global _main
_main:
pushl %ebp
pushl %eax
calll test
test:
popl %eax
/* var1, var2, ... */
movl var1-test(%eax), %eax
addl $4, %esp
popl %ebp
retl
/**
* how var1(label) - test(label) skips this label
* if it's about address subtracting?
*/
blobbbb:
xor %edx, %edx
.data
var1: .long 6
var2: .long 135
And it doesn't make much sense to me, because according to this guide the caller should 1) push the parameters onto the stack (none) and 2) call the label, and the callee should actually play with ESP, EBP and other registers. Also, why do I even need an intermediate label, or better said, is there any way without it?
In 32-bit mode, there is no eip-relative addressing mode as there is in 64-bit mode. Thus, code like lea var1(%eip), %ebx is not actually legal and doesn't assemble in 32-bit mode. (In 64-bit mode it will truncate the address to 32 bits.) In traditional non-PIE 32-bit binaries, you would just do movl var, %eax, which moves the value at the absolute address of var to eax, but that's not possible in PIE binaries as the absolute address of var
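is unknown at link time.
What the linker does know is the layout of the binary and what the distance between labels is. Thus, to access a global variable, you proceed like this:
1) Find the run-time address of some label in the code.
2) Add the link-time distance from that label to var.
3) Dereference the result to get the content of var.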
Steps 2 and 3 can be combined using an addressing mode with displacement. Step 1 is tricky. There is only one useful instruction that tells us the address of a location whose address we don't know, and that's call: the call instruction pushes the address of the next instruction on the stack and then jumps to the indicated address. If we tell call to just jump to the next address, we reduce its functionality to what is essentially push %eip.
Note that this use case is special-cased in the CPU's return prediction to not actually count as a function call. Since this is not a real function call, we don't establish a stack frame or similar and we don't have a return for this call. It's just a mechanism to get the value of the instruction pointer.
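The fake call described above can be sketched roughly as follows. This is a minimal fragment in 32-bit AT&T syntax, not a complete program; the label name Label is the one used in the rest of this answer.

```asm
        call    Label           # pushes the address of Label, then jumps to Label
Label:                          # execution continues here;
                                # (%esp) now holds the run-time address of Label
```

The pushed value stays on the stack until it is popped, so the stack is not balanced at this point.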
So from this, we know the address of Label. Next we can pop it off the stack and use it to find the address of var, and then we can dereference this to get the content of var. In real code, you would merge the addition and the memory operand to save an instruction:
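A sketch of both forms, assuming var is a 32-bit variable defined in the .data section of the same file (the definition and its value are only there to make the fragment complete). The unmerged steps are shown as comments so that only the merged load is actually executed.

```asm
        .text
        call    Label
Label:  popl    %eax                    # eax = run-time address of Label
        # Unmerged form, one instruction per step:
        #   addl  $var-Label, %eax      # eax = run-time address of var
        #   movl  (%eax), %eax          # eax = content of var
        movl    var-Label(%eax), %eax   # merged form: eax = content of var

        .data
var:    .long   6                       # assumed definition, for illustration
```

The difference var-Label is a constant the toolchain can resolve without knowing where the binary is loaded, which is what makes this position independent.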
If you want to refer to multiple static variables in one function, you only need to use this trick once. Just use suitable differences:
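For instance, with the var1 and var2 from the question, something along these lines should work. The choice of %ecx as the base register and %edx as the second destination is arbitrary and only an assumption of this sketch.

```asm
        .text
        call    Label
Label:  popl    %ecx                    # ecx = run-time address of Label
        movl    var1-Label(%ecx), %eax  # eax = content of var1
        movl    var2-Label(%ecx), %edx  # edx = content of var2

        .data
var1:   .long   6
var2:   .long   135
```

The base register has to stay intact for as long as further variables are accessed through it.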
Note that gcc favours a variant of this idiom to get the content of the instruction pointer. It creates a bunch of functions which move the return address to the indicated register. This is a special function that does not follow the normal calling convention; one exists for each of eax, ebx, ecx, edx, esi, and edi, depending on which register gcc wants to use.
gcc uses this code for better performance on CPUs whose return prediction does not account for this fake-call idiom. I don't know which CPUs are actually affected though.
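As a rough illustration of that variant, here is the ebx version. The name __x86.get_pc_thunk.bx is what recent gcc emits on ELF targets (older releases used __i686.get_pc_thunk.bx). The caller part is a hypothetical sketch that reuses the label-difference addressing from above, with .Lref and var as assumed names; gcc itself goes through the global offset table instead.

```asm
        .text
        # hypothetical caller
        call    __x86.get_pc_thunk.bx   # returns with ebx = address of .Lref
.Lref:  movl    var-.Lref(%ebx), %eax   # eax = content of var

__x86.get_pc_thunk.bx:
        movl    (%esp), %ebx            # copy the return address into ebx
        ret                             # a real return, so call and ret are paired

        .data
var:    .long   6                       # assumed definition, for illustration
```

Because the thunk really returns, the call is matched by a ret, unlike the call/pop pair.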
Note lastly that no label is skipped over. I don't quite understand what you mean with respect to blobbbb. Which control flow is supposed to reach this label?
Finally, your example should look like this:
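A reconstruction of such a program, assuming the macOS conventions from the question (_main as the entry symbol, built with gcc -m32). The label name Label is my choice; the test label from the question works the same way.

```asm
        .text
        .globl  _main
_main:
        call    Label                   # push the run-time address of Label
Label:  popl    %eax                    # eax = run-time address of Label
        movl    var1-Label(%eax), %eax  # eax = content of var1
        ret                             # return value of main is in eax

        .data
var1:   .long   6
var2:   .long   135
```

The call pushes one word and the popl removes it again, so the stack is balanced at ret and no frame setup with ebp is needed.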
Note that the .end directive is never needed. Labels that begin with a capital L are local labels that do not end up in the symbol table, which is why the C compiler likes to use them.