x86_64 ABI: disassembly issue


I've got the following C code:

#include <stdio.h>
#include <stdlib.h>   /* declares exit() */

int function(int a, int b)
{
    int res = a + b;
    return res;
}

int main(){
    function(1,2);
    exit(0);
}

I compile it for x86-64 with gcc 4.8.2 (under Ubuntu 14) and it produces this code:

000000000040052d <function>:
  40052d:       55                      push   %rbp
  40052e:       48 89 e5                mov    %rsp,%rbp
  400531:       89 7d ec                mov    %edi,-0x14(%rbp)
  400534:       89 75 e8                mov    %esi,-0x18(%rbp)
  400537:       8b 45 e8                mov    -0x18(%rbp),%eax
  40053a:       8b 55 ec                mov    -0x14(%rbp),%edx
  40053d:       01 d0                   add    %edx,%eax
  40053f:       89 45 fc                mov    %eax,-0x4(%rbp)
  400542:       8b 45 fc                mov    -0x4(%rbp),%eax
  400545:       5d                      pop    %rbp
  400546:       c3                      retq   

I can't understand some things.

At the beginning we push rbp and save rsp in rbp. Then on the top of the stack (and at %rbp) we've got the saved rbp. Everything below rbp is then free space.

But then we place passed parameters from edi and esi at -0x14(%rbp) and below.

But why can't we put them immediately below what rbp/rsp points at? edi and esi are 4 bytes long, why not -0x8(%rbp) and -0xc(%rbp), then? Is it connected with memory alignment?

And why is there a weird saving of eax to the stack and reading it back right before the return?

Answer by nneonneo (best answer):

First of all, please note that you're looking at unoptimized compiler output. Compiler output often ends up looking kind of stupid with optimizations turned off, because the compiler translates each C statement more or less independently into an equivalent run of assembly without bothering to do even the simplest, most obvious optimizations.

For your first question, the answer is "because that's where your compiler decided the variables should go". There's no better answer - compilers differ widely in their stack allocation schemes. For example, Clang on my machine outputs this instead:

pushq   %rbp
movq    %rsp, %rbp
movl    %edi, -4(%rbp)
movl    %esi, -8(%rbp)
movl    -4(%rbp), %esi
addl    -8(%rbp), %esi
movl    %esi, -12(%rbp)
movl    -12(%rbp), %eax
popq    %rbp
retq

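where you can see clearly that a gets stored at -4, b gets stored at -8, and res is stored at -12. This is a tighter packing than what your GCC is giving you, but this is just a quirk of GCC and nothing more. (Neither compiler adjusts %rsp before writing below it: the x86-64 System V ABI reserves a 128-byte red zone below %rsp, which a leaf function like this one may use freely.)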
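One way to see where a given compiler places these variables without reading the disassembly is to record their offsets from the frame base. The following is a minimal sketch, assuming GCC or Clang (it relies on the `__builtin_frame_address` builtin); `layout_probe` and its `offsets` parameter are hypothetical names, not part of the original code. Taking the address of a variable can itself change where the compiler places it, so the numbers need not match the listings above.

```c
#include <stdint.h>

/* Hypothetical probe: same body as function(), plus bookkeeping that
 * records the byte offset of a, b and res from the frame base.
 * Assumes GCC or Clang, which provide __builtin_frame_address. */
int layout_probe(int a, int b, long offsets[3])
{
    int res = a + b;
    intptr_t frame = (intptr_t)__builtin_frame_address(0);

    /* Compare addresses as integers to avoid subtracting unrelated pointers. */
    offsets[0] = (long)((intptr_t)&a - frame);
    offsets[1] = (long)((intptr_t)&b - frame);
    offsets[2] = (long)((intptr_t)&res - frame);
    return res;
}
```

Calling `layout_probe(1, 2, offsets)` from `main` and printing the three entries with `%ld` shows the layout chosen by whichever compiler and flags are in use.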

For your second question, let's just look at how the instructions map to C:


Standard function prologue (setting up the stack frame):

  40052d:       55                      push   %rbp
  40052e:       48 89 e5                mov    %rsp,%rbp

Store two arguments into the stack variables a and b:

  400531:       89 7d ec                mov    %edi,-0x14(%rbp)
  400534:       89 75 e8                mov    %esi,-0x18(%rbp)

Load a and b for a + b:

  400537:       8b 45 e8                mov    -0x18(%rbp),%eax
  40053a:       8b 55 ec                mov    -0x14(%rbp),%edx

Actually do a + b:

  40053d:       01 d0                   add    %edx,%eax

Set res = (result of a + b):

  40053f:       89 45 fc                mov    %eax,-0x4(%rbp)

Copy res to the return value (return res;):

  400542:       8b 45 fc                mov    -0x4(%rbp),%eax

Actually return:

  400545:       5d                      pop    %rbp
  400546:       c3                      retq   

So you can see that the redundant saving and loading of eax is simply because the save and load correspond to different statements of your original C file: the save is from int res = a + b; and the load is from return res;.
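To make the statement-by-statement mapping concrete, here is a small sketch with two versions of the function; `function_direct` is a hypothetical variant added only for comparison. Compiled without optimization (for example `gcc -O0 -c` followed by `objdump -d`), one would expect the extra store and reload only in the first version, while with `-O` both should reduce to the same instructions. The exact output depends on the compiler and its version.

```c
/* Original form: the sum goes through the local variable res. */
int function(int a, int b)
{
    int res = a + b;   /* at -O0: compute a + b, store it in res's stack slot */
    return res;        /* at -O0: reload res into %eax, the return register   */
}

/* Hypothetical variant: no named temporary, so there is no separate
 * statement that would need a store followed by a reload. */
int function_direct(int a, int b)
{
    return a + b;
}
```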

For comparison, here's Clang's optimized output (-O):

pushq   %rbp
movq    %rsp, %rbp
addl    %esi, %edi
movl    %edi, %eax
popq    %rbp
retq

Much smarter: apart from the frame-pointer push and pop there is no stack traffic at all, and the entire function body is just the two instructions addl and movl. (Of course, if you declare the function static, then with optimization enabled both GCC and Clang will happily detect that the function is never productively used and simply delete it outright.)
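As a sketch of that last point, the following file declares the function static and throws its result away; `call_and_discard` is a hypothetical caller standing in for `main`. Compiling it with optimization enabled and inspecting the object file, for example with `objdump -d` or `nm`, should show no `function` symbol, though this depends on the compiler and flags used.

```c
/* static: visible only inside this translation unit, so the compiler
 * can see every call site. */
static int function(int a, int b)
{
    int res = a + b;
    return res;
}

/* Hypothetical caller: calls function() but ignores the result,
 * just as main() does in the question. */
int call_and_discard(void)
{
    function(1, 2);   /* no side effects, result unused: a candidate for removal */
    return 0;
}
```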