Overhead when creating/expanding structure in function call

Question

75 views Asked by G. B. At 13 June 2023 at 14:33

I have to create two versions of the same function: one that takes all parameters individually and one that takes them packed in a struct. The number of parameters is arbitrary. I implement the functionality in only one of them; the other just calls it, either with the struct members expanded or with a newly initialized struct.

Is there a difference in the overhead between the two versions below?

Version 1

int functionWithStructure(MyStructure a)
{
    return functionWithMultipleParams(a.Myparam1, a.Myparam2);
}

int functionWithMultipleParams(int param1, int param2)
{
    return /* implement something */;
}

Version 2

int functionWithMultipleParams(int param1, int param2)
{
    return functionWithStructure((MyStructure) {param1, param2});
}

int functionWithStructure(MyStructure a)
{
    return /* implement something */;
}

Original Q&A

There is 1 answer

**Jan Schultke** · Answer 1 · 2023-06-13T14:50:27+00:00

You can't say that one version is always better than the other. Sometimes it is better to pack parameters into a struct, and sometimes it is worse.

In the x86-64 System V ABI (the one used on Linux and macOS), there is a difference between passing two int parameters and passing a single struct parameter that holds two ints.

In the former case, each int is passed in its own register (edi and esi).
In the latter case, both struct members are packed into the single register rdi.
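
The packing described above can be observed from C itself. The following is a minimal sketch, assuming an x86-64 System V target (little-endian, 32-bit int, no padding in the struct); `pack_point` is a hypothetical helper introduced only for illustration, not part of any ABI or library.

```c
#include <stdint.h>
#include <string.h>

struct point {
    int x;
    int y;
};

/* Assumption: two 32-bit ints with no padding, so the struct is 8 bytes. */
_Static_assert(sizeof(struct point) == sizeof(uint64_t),
               "sketch assumes an 8-byte struct point");

/* Hypothetical helper: copies the struct's 8 bytes into one 64-bit integer,
   mirroring how the whole struct occupies rdi when passed by value. */
static uint64_t pack_point(struct point p) {
    uint64_t bits;
    memcpy(&bits, &p, sizeof bits);
    return bits;
}
```

On a little-endian target, x ends up in the low 32 bits of the result and y in the high 32 bits.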

As a rule of thumb, a struct is better when we perform operations with the whole struct (like passing it to other functions), whereas separate parameters are better when using them in separate ways.

Positive Cost `struct`

struct point {
    int x;
    int y;
};

int sum(int x, int y) {
    return x + y;
}

int struct_sum(struct point p) {
    return p.x + p.y;
}

This produces (GCC 13, -O2):

sum:
        lea     eax, [rdi+rsi]
        ret
struct_sum:
        mov     rax, rdi
        shr     rax, 32
        add     eax, edi
        ret

You can see that sum simply computes the sum of rdi and rsi, whereas struct_sum first has to unpack the operands into separate registers, since they both start in rdi.
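
The unpacking step can be sketched in C as well. This is only an illustration of what the struct_sum assembly does, assuming the same little-endian layout; `sum_from_packed` and its `packed` parameter are hypothetical names standing in for the function and the rdi register.

```c
#include <stdint.h>

/* Hypothetical C rendering of the struct_sum assembly.
   `packed` plays the role of rdi holding both members. */
static int sum_from_packed(uint64_t packed) {
    uint32_t x = (uint32_t)packed;         /* low half, already visible as edi */
    uint32_t y = (uint32_t)(packed >> 32); /* high half, the shr rax, 32 step */
    return (int)(x + y);                   /* the add eax, edi step */
}
```

The extra shift is the cost that the version with two separate int parameters does not pay.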

Negative Cost `struct`

struct point {
    int x;
    int y;
};

struct point lowest_bit(int x, int y) {
    return (struct point) {x & 1, y & 1};
}

struct point struct_lowest_bit(struct point p) {
    return (struct point) {p.x & 1, p.y & 1};
}

This produces (clang trunk, -O2):

lowest_bit:
        and     edi, 1
        and     esi, 1
        shl     rsi, 32
        lea     rax, [rdi + rsi]
        ret
struct_lowest_bit:
        movabs  rax, 4294967297
        and     rax, rdi
        ret

Note: GCC doesn't find this optimization for some reason.

In this case, it's better for both members to be packed into rdi, because the & 1 on both members can then be performed at once by a single 64-bit and with the mask 4294967297 (0x100000001).
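
To see why a single mask is enough, here is a small sketch that compares the two approaches on a 64-bit value laid out like the struct (x in the low half, y in the high half, as assumed for x86-64). Both function names are hypothetical and exist only for this comparison.

```c
#include <stdint.h>

/* Mask each member separately, then combine: roughly what lowest_bit does. */
static uint64_t lowest_bits_separate(uint32_t x, uint32_t y) {
    return (uint64_t)(x & 1u) | ((uint64_t)(y & 1u) << 32);
}

/* Mask both members at once: roughly what struct_lowest_bit does.
   0x0000000100000001 is 4294967297, with bit 0 and bit 32 set. */
static uint64_t lowest_bits_packed(uint64_t packed) {
    return packed & UINT64_C(0x0000000100000001);
}
```

For any x and y, calling `lowest_bits_packed(((uint64_t)y << 32) | x)` gives the same result as `lowest_bits_separate(x, y)`, but with one operation instead of three.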

Also see: C++ Weekly - Ep 119 - Negative Cost Structs (C++ video, but equally applies to C due to similar ABI).
