How does a function call work?


I am thinking about how a function call works in assembler. Currently I think it works like:

push arguments on stack
push eip register on stack and jump to the new eip value  # call instruction

# callee's code
push ebp register on stack
working in the function
returning from function
pop ebp
pop eip       # ret instruction

But now that I think about it, how does the assembler save the current stack pointer?

For example, if I have some local variables, esp (the stack pointer) goes down, and when I return to the main function esp has to be set back to the right place. How does this work?

2

There are 2 answers

10
599644 On

Have a look at the Calling conventions page on Wikipedia.

Stack before call (higher addresses at the top; the stack grows downward, so ESP is at the bottom of the current frame):

0x8100 - +------------+
...... - |            |
...... - |            |
0x8000 - +------------+ <- EBP
...... - |            |
...... - | Cur. Frame |
...... - |            |
...... - +------------+ <- ESP

push arguments
push eip register on stack (done by call)
push ebp register on stack, then mov ebp, esp


0x8100 - +------------+
...... - |            |
...... - |            |
0x8000 - +------------+
...... - |            |
...... - | Old Frame  |
...... - |            |
...... - +------------+
...... - | Arguments  |
...... - | EIP        |
...... - | 0x8000     | <- Old EBP; EBP and ESP both point here
...... - +------------+

pop ebp
pop eip (done by ret); the caller then removes the arguments

0x8100 - +------------+
...... - |            |
...... - |            |
0x8000 - +------------+ <- EBP
...... - |            |
...... - |  Frame     | <- Current frame again!
...... - |            |
...... - +------------+ <- ESP
...... - |            |
...... - | Popped     |
...... - |            |
...... - +------------+
35
Peter Cordes On

It was hard to figure out what you were missing, but I think what you're missing is that the caller has to fix the stack after the called function returns. The caller knows how much it pushed before the call, so it can add esp, some_constant after the call instruction to clear the args from the stack, putting ESP back to where it was before the first push.
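To make the caller-cleanup idea concrete, here is a minimal sketch in C of a toy downward-growing stack. Everything in it (the ToyStack type, the toy_* helpers, the 0x1234 return address) is made up for illustration and is not real machine state; it only shows that pushing two args, doing CALL/RET, and then adding 8 bytes back leaves ESP where it started.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of a downward-growing 32-bit stack (hypothetical, for illustration). */
enum { STACK_WORDS = 64 };

typedef struct {
    uint32_t mem[STACK_WORDS];
    uint32_t esp;              /* index of the top-of-stack slot (4 bytes per slot) */
} ToyStack;

static void toy_init(ToyStack *s) { s->esp = STACK_WORDS; }

static void toy_push(ToyStack *s, uint32_t v) {
    assert(s->esp > 0);
    s->mem[--s->esp] = v;      /* PUSH: decrement ESP, then store */
}

static uint32_t toy_pop(ToyStack *s) {
    assert(s->esp < STACK_WORDS);
    return s->mem[s->esp++];   /* POP: load, then increment ESP */
}

/* Caller side of a caller-pops call with two args.
   Returns ESP after cleanup so it can be compared with ESP before. */
static uint32_t toy_call_two_args(ToyStack *s, uint32_t a, uint32_t b) {
    toy_push(s, b);            /* push args, last one first */
    toy_push(s, a);
    toy_push(s, 0x1234);       /* CALL: push a (made-up) return address */
    /* ... the callee's body would run here ... */
    (void)toy_pop(s);          /* RET: pop the return address */
    s->esp += 2;               /* add esp, 8: two 4-byte args = 2 slots */
    return s->esp;
}
```

Calling toy_call_two_args on a freshly initialised ToyStack should return the same esp value the stack had before the call, which is the property the caller relies on.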


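ESP is call-preserved in all calling conventions. Called functions aren't allowed to return with ESP different from what it was before the call. (Callee-pops conventions such as stdcall are the one variation: the callee returns with ret N, which also removes N bytes of args, so ESP still ends up at a value the caller knows exactly.) If they return with a plain ret, this could only happen if they copied the return address somewhere else on the stack before running ret! So it's a pretty obvious restriction that some calling-convention descriptions fail to mention.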

Anyway, this means that the caller can assume ESP wasn't modified, so it can save/restore anything else with PUSH/POP.

EBP is also call-preserved in all calling conventions I'm aware of. See https://stackoverflow.com/tags/x86/info (the tag wiki) for calling convention/ABI docs.

See also the calling conventions article on Wikipedia for short summaries.


Also, your pseudo-code for a function call was really weird and confusing (before I edited the question). It didn't clearly show the boundary between the caller's code and the callee's code. In a previous version of this answer, I thought you were saying the caller's code was pushing EBP, because that came before the working in the function line.

EIP isn't directly accessible, and can only be modified by control-transfer instructions (jumps, CALL, RET). CALL pushes a return address and then jumps. Note that it pushes the address of the next instruction, so the CALL itself doesn't run again on return. (EIP during the execution of an instruction could be said to point at the next instruction, since relative jumps are encoded with a displacement from the end of the instruction. Same for x86-64 RIP-relative addresses.)
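The address arithmetic can be sketched in C. This assumes the common near-call encoding call rel32 (opcode E8 followed by a 4-byte displacement, 5 bytes total); the function names are hypothetical helpers, not part of any API.

```c
#include <stdint.h>

/* Length in bytes of a near `call rel32`: opcode E8 + 4-byte displacement. */
enum { CALL_REL32_LEN = 5 };

/* Return address pushed by CALL: the address of the instruction after it. */
static uint32_t call_return_address(uint32_t call_addr) {
    return call_addr + CALL_REL32_LEN;
}

/* Jump target: the displacement is relative to the END of the CALL,
   i.e. to the same address that gets pushed as the return address. */
static uint32_t call_target(uint32_t call_addr, int32_t rel32) {
    return call_return_address(call_addr) + (uint32_t)rel32;
}
```

For a CALL located at 0x1000, the pushed return address is 0x1005, and a displacement of -5 would jump back to the CALL itself.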

RET pops into EIP. For it to return to the right place, the callee has to restore ESP so it points at the return address that the caller's CALL pushed.

Assuming a 32-bit stack-args calling convention like System V i386, I'd write your pseudocode as:

(optional) push ecx or whatever call-clobbered registers you want to save
push arguments on stack
CALL function (pushes a return address, i.e. the addr of the insn after the call)

  # code of the called function
  (optional) push ebp   (and any other call-preserved regs the function wants to use)
  working in the function
  (optional) pop  ebp   (and any other regs, in reverse order of pushing)
  RET (pops the return address into EIP)

add esp, 8 (for example) to clear args from the stack
(optional) pop  ecx   or whatever other volatile regs you want to restore

Look at the compiler-generated asm for a real function sometime, like the one below. Try it with different compiler options, or change the source, on the Godbolt compiler explorer:

int extern_func(int a);

int foo() {
  int a = extern_func(2);
  int b = extern_func(5);
  return a+b;
}

Compiled with gcc6.2 -m32 -O3 -fno-omit-frame-pointer to make 32-bit code which uses EBP the way you're assuming, instead of the default omit-frame-pointer mode. I could have used -O0, but un-optimized asm is so bloated that it sucks to read, and there's nothing confusing that gcc can do here. Also used -fverbose-asm to get it to mark variable names on operands.

foo:
    push    ebp
    mov     ebp, esp              # standard prologue
    push    ebx                   # save ebx so we have a call-preserved register
    sub     esp, 16               # reserve stack space (here just padding so ESP is 16-byte aligned at the calls; no locals live in memory)
    push    2                     # the arg for the first function call
    call    extern_func
    mov     ebx, eax  # a,        # stash the return value where it won't be clobbered by the next call
    mov     DWORD PTR [esp], 5        # just write the new arg to the stack, instead of add esp, 4  and push 5
    call    extern_func     #
    add     eax, ebx  # tmp90, a     # this is a+b as the return value
    mov     ebx, DWORD PTR [ebp-4]    #, ESP isn't pointing to where we pushed EBX, so restore it with a normal MOV load.
    leave                             # and set esp=ebp and pop ebp
    # at this point, ESP is back to its value on entry to the function
    ret

clang makes some different choices about how to do things (including using esi instead of ebx), and does the epilogue with

    add     eax, esi
    add     esp, 4
    pop     esi
    pop     ebp
    ret

So it's a more "normal" sequence: restore ESP to pointing at the registers pushed in the prologue and pop them, again leaving ESP pointing at the return address ready for RET.
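As a rough way to compare the two epilogue styles, here is a toy C model of the registers and stack. The Regs type and helper names are invented for illustration, and both epilogues share one simplified prologue (so the pop-style epilogue adds back 16 bytes and restores ebx, rather than clang's actual add esp, 4 and esi). It only demonstrates that either style leaves ESP pointing at the return address with EBP and the saved register restored.

```c
#include <assert.h>
#include <stdint.h>

enum { FRAME_WORDS = 32 };

/* Hypothetical toy machine: esp/ebp are slot indices (4 bytes per slot). */
typedef struct {
    uint32_t mem[FRAME_WORDS];
    uint32_t esp, ebp, ebx;
} Regs;

static void reg_push(Regs *r, uint32_t v) { assert(r->esp > 0); r->mem[--r->esp] = v; }
static uint32_t reg_pop(Regs *r) { assert(r->esp < FRAME_WORDS); return r->mem[r->esp++]; }

/* push ebp; mov ebp, esp; push ebx; sub esp, 16 */
static void prologue(Regs *r) {
    reg_push(r, r->ebp);
    r->ebp = r->esp;
    reg_push(r, r->ebx);
    r->esp -= 4;                    /* 16 bytes = 4 slots */
}

/* gcc style: mov ebx, [ebp-4]; leave
   Works no matter where ESP currently is, because it reloads ESP from EBP. */
static void epilogue_leave(Regs *r) {
    r->ebx = r->mem[r->ebp - 1];    /* saved ebx sits one slot below EBP */
    r->esp = r->ebp;                /* leave, part 1: mov esp, ebp */
    r->ebp = reg_pop(r);            /* leave, part 2: pop ebp */
}

/* pop style: add esp, 16; pop ebx; pop ebp
   Requires ESP to be exactly where the prologue left it. */
static void epilogue_pops(Regs *r) {
    r->esp += 4;
    r->ebx = reg_pop(r);
    r->ebp = reg_pop(r);
}
```

In this model, an extra push left on the stack before epilogue_leave does no harm, while epilogue_pops only works if ESP has been brought back to its post-prologue value first.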