C ABI with LLVM


I've got a compiler built on LLVM and I'm looking to improve its ABI compliance. I've found it hard to find specification documents for the C ABI on Windows x86 or Linux, and the ones I have found explain it in terms of RAX/EAX/etc. rather than in IR terms that I can use.

So far, I think I've figured out that LLVM treats aggregates transparently: it considers each member a distinct parameter. So on Windows x64, for example, if I want to handle an aggregate the way the document says, I'll need to coerce it to a single integer of the same size if that size is 8, 16, 32, or 64 bits. Otherwise, I pass it by pointer.
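To make the Windows x64 rule above concrete, here is a minimal sketch of the size check. It only models the decision and uses no LLVM headers; `ArgKind`, `ArgLowering`, `classifyWin64Aggregate`, and `irParamType` are hypothetical names made up for illustration, not LLVM or Clang APIs. The `ptr` spelling assumes a recent LLVM with opaque pointers; older releases would spell it as a typed pointer.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical result type for this sketch; not an LLVM or Clang class.
enum class ArgKind { CoerceToInteger, Indirect };

struct ArgLowering {
    ArgKind kind;
    unsigned integerBits;  // only meaningful when kind == CoerceToInteger
};

// Classify an aggregate purely by size, following the rule described above:
// 8, 16, 32, or 64 bits becomes one integer; anything else goes by pointer.
ArgLowering classifyWin64Aggregate(std::uint64_t sizeInBytes) {
    assert(sizeInBytes > 0 && "empty aggregates are not handled in this sketch");
    switch (sizeInBytes) {
    case 1: case 2: case 4: case 8:
        return {ArgKind::CoerceToInteger, static_cast<unsigned>(sizeInBytes * 8)};
    default:
        return {ArgKind::Indirect, 0};
    }
}

// Spell the IR parameter type a frontend might emit for the lowered argument.
std::string irParamType(const ArgLowering &lowering) {
    if (lowering.kind == ArgKind::CoerceToInteger)
        return "i" + std::to_string(lowering.integerBits);
    return "ptr";  // assumption: opaque-pointer LLVM
}
```

With this sketch, an 8-byte struct maps to `i64`, while a 3-byte or 12-byte struct falls through to the pointer case.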

For Windows x86, it seems like __cdecl and __stdcall don't need any action from me, as all parameters are passed on the stack. __fastcall says that the first two arguments of 32 bits or smaller are passed in registers, so I'll need to coerce aggregates of that size or less. __thiscall passes this in a register and the rest on the stack, so it seems like I won't need to perform any adjustment here.

For __vectorcall, pass aggregates not more than sizeof(void*) by integer coercion. For other aggregates, if they are HVAs then pass by value; else pass by value on x86 or pass by pointer on x64.
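The __vectorcall rule as I've stated it can be sketched as a small decision function. This encodes only my reading above, not a verified statement of the Microsoft ABI, and every name in it (`Arch`, `PassMode`, `AggregateInfo`, `classifyVectorcallAggregate`) is hypothetical. Whether a type is an HVA is assumed to be computed elsewhere.

```cpp
#include <cassert>
#include <cstdint>

enum class Arch { X86, X64 };
enum class PassMode { CoerceToInteger, ByValue, ByPointer };

// Hypothetical summary of an aggregate; a real frontend would derive this
// from its own type representation.
struct AggregateInfo {
    std::uint64_t sizeInBytes;
    bool isHVA;  // homogeneous vector aggregate, determined elsewhere
};

// sizeof(void*) for the target, not for the host running the compiler.
unsigned targetPointerSize(Arch arch) { return arch == Arch::X64 ? 8 : 4; }

PassMode classifyVectorcallAggregate(const AggregateInfo &agg, Arch arch) {
    assert(agg.sizeInBytes > 0 && "empty aggregates are not handled in this sketch");
    // Small aggregates: coerce to an integer of the same size.
    if (agg.sizeInBytes <= targetPointerSize(arch))
        return PassMode::CoerceToInteger;
    // Larger HVAs are passed by value on both targets.
    if (agg.isHVA)
        return PassMode::ByValue;
    // Everything else: by value on x86, by pointer on x64.
    return arch == Arch::X86 ? PassMode::ByValue : PassMode::ByPointer;
}
```

For instance, a 32-byte non-HVA struct comes out as ByValue on x86 and ByPointer on x64 under this reading.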

This seems simple (well, relatively), but the LLVM docs for the signext attribute clearly state "This indicates to the code generator that the parameter or return value should be sign-extended to the extent required by the target’s ABI (which is usually 32-bits) by the caller (for a parameter) or the callee (for a return value)." The Microsoft pages for the x86 calling conventions mention nothing about extending anything to any width.
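To illustrate what "extended to 32 bits" means for a narrow argument, here is a small sketch in plain C++ that widens an 8-bit value both ways. It shows only the arithmetic, not how LLVM implements the attribute; `signExtend8To32`, `zeroExtend8To32`, and `demoExtension` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstdint>

// Widen an 8-bit pattern to 32 bits by replicating bit 7 into the upper bits,
// which is what sign extension does.
std::uint32_t signExtend8To32(std::uint8_t raw) {
    std::int32_t value = raw;          // 0..255
    if (value & 0x80) value -= 0x100;  // reinterpret as -128..-1
    return static_cast<std::uint32_t>(value);
}

// Widen an 8-bit pattern to 32 bits by filling the upper bits with zeros.
std::uint32_t zeroExtend8To32(std::uint8_t raw) {
    return static_cast<std::uint32_t>(raw);
}

// Usage: the same 8-bit pattern gives different 32-bit register contents.
void demoExtension() {
    assert(signExtend8To32(0xFF) == 0xFFFFFFFFu);  // -1 stays -1
    assert(zeroExtend8To32(0xFF) == 0x000000FFu);  // 255 stays 255
}
```

The point is that caller and callee have to agree on which of the two happened to the upper 24 bits, or agree that nobody relies on them.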

I've also observed that the LLVM IR Clang generates on Windows uses the byval attribute, yet nothing in the understanding I've gleaned above ever calls for byval.

How would I lower the various platform C ABIs to LLVM IR?

2

There are 2 answers

9
Eli Bendersky (best answer)

I can't say I understand your question 100%, but it's worth noting that LLVM IR simply cannot represent all the subtleties of platform ABIs. Therefore, in the Clang toolchain, it is the frontend that's responsible for performing ABI lowering, such as properly passing objects by value to functions.

Take a look at lib/Basic/Targets.cpp in the Clang source tree for the definitions. The gory details are further in lib/CodeGen/TargetInfo.cpp.

0
Puppy

I ended up hacking Clang's CodeGen internals to perform C ABI calling for me (C++ ABI support was already done). Instead of having to re-implement (and re-test) their code, I simply re-used their work. Officially the CodeGen APIs aren't public and aren't meant to be used by anyone, but in this case I managed to make it work. It turns out to be a lot less scary than it looks: many of the classes like LValue/RValue/ReturnValueSlot are just wrappers on llvm::Value* with a couple of extra optional semantics tacked on.

More problematic will be creating trampolines from C ABI to my own ABI. The CodeGenFunction interface doesn't seem quite as amenable to that. But I think I can make it work.