We know that there is padding in some structures in C. Please consider the following 2:

struct node1 {
      int a;
      int b;
      char c;
};

struct node2 {
      int a;
      char c;
      int b;
};

Assuming sizeof(int) = alignof(int) = 4 bytes:
sizeof(struct node1) = sizeof(struct node2) = 12, due to padding.

What is the performance difference between the two? (if any, w.r.t. the compiler or the architecture of the system, especially with GCC)

4

There are 4 answers

0
Ruud Helderman On

I would not be surprised if the interviewer's opinion was based on the old argument of backward compatibility when extending the struct in the future. Additional small fields (char, short) could then use the space occupied by the trailing padding, without the risk of changing the memory offsets of the existing fields.

In most cases, it's a moot point. The approach itself is likely to break compatibility, for two reasons:

  1. Starting the extensions on a new alignment boundary (as would happen to node2) may not be memory-optimal, but it might well prevent the new fields from accidentally being overwritten by the padding of a 'legacy' struct.
  2. When compatibility is that much of an issue (e.g. when persisting or transferring data), then it makes more sense to serialize/deserialize (even if binary is a requirement) than to depend on a binary format that varies per architecture, per compiler, even per compiler option.
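
As a minimal sketch of the second point, node1 could be written to a fixed byte layout one field at a time instead of dumping the struct's in-memory representation. The 9-byte little-endian wire format and the names put_u32_le, get_u32_le, node1_serialize and node1_deserialize are assumptions made up for this illustration, not part of the question:

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

struct node1 {
    int a;
    int b;
    char c;
};

/* Assumption for this sketch: int is exactly 32 bits wide. */
static_assert(sizeof(int) * CHAR_BIT == 32, "sketch assumes 32-bit int");

/* Hypothetical wire format: a and b as 32-bit little-endian values,
   followed by c as a single byte. No padding is ever written. */
enum { NODE1_WIRE_SIZE = 9 };

static void put_u32_le(unsigned char *out, uint32_t v) {
    out[0] = (unsigned char)(v & 0xFF);
    out[1] = (unsigned char)((v >> 8) & 0xFF);
    out[2] = (unsigned char)((v >> 16) & 0xFF);
    out[3] = (unsigned char)((v >> 24) & 0xFF);
}

static uint32_t get_u32_le(const unsigned char *in) {
    return (uint32_t)in[0] | ((uint32_t)in[1] << 8) |
           ((uint32_t)in[2] << 16) | ((uint32_t)in[3] << 24);
}

void node1_serialize(const struct node1 *n, unsigned char out[NODE1_WIRE_SIZE]) {
    put_u32_le(out, (uint32_t)n->a);
    put_u32_le(out + 4, (uint32_t)n->b);
    out[8] = (unsigned char)n->c;
}

void node1_deserialize(const unsigned char in[NODE1_WIRE_SIZE], struct node1 *n) {
    /* Converting back to int relies on two's complement wrap-around,
       which GCC documents but the C standard leaves implementation-defined. */
    n->a = (int)get_u32_le(in);
    n->b = (int)get_u32_le(in + 4);
    n->c = (char)in[8];
}
```

With this approach the stored bytes no longer depend on member order, padding, or the byte order of the machine, so the struct can be rearranged or extended without touching existing data.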
0
Brendan On

What is the performance difference between the two?

The performance difference is "indeterminable". For most cases it won't make any difference.

For cases where it does make a difference, either version might be faster, depending on how the structure is used. For example, suppose you have a large array of these structures and frequently select one of them "randomly". If you only access a and b of the selected structure, the first version can be faster (because a and b are more likely to be in the same cache line); if you only access a and c, the second version can be faster.
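
To put rough numbers on the cache-line argument, here is a small sketch that counts how often two members of the same array element fall into different cache lines. The function name count_split, the 64-byte line size used in the note below, and the assumption that the array starts on a cache-line boundary are mine, not something the question states:

```c
#include <stddef.h>

/* Among the first n elements of an array whose element size is `stride`,
   count how many have the members at byte offsets off1 and off2 in
   different cache lines. Assumes the array starts on a cache-line
   boundary and that each member lies entirely within one line. */
size_t count_split(size_t stride, size_t off1, size_t off2,
                   size_t n, size_t line_size) {
    size_t split = 0;
    for (size_t i = 0; i < n; i++) {
        size_t line1 = (i * stride + off1) / line_size;
        size_t line2 = (i * stride + off2) / line_size;
        if (line1 != line2)
            split++;
    }
    return split;
}
```

With 12-byte elements and 64-byte lines the pattern repeats every 16 elements. Members at offsets 0 and 4 (a and b in node1, a and c in node2) are split in 1 of those 16 elements, while members at offsets 0 and 8 (a and c in node1, a and b in node2) are split in 2 of them. The difference exists, but it is small.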

5
Lundin On

These are bad examples - in this case it doesn't matter, since the amount of padding will be the same in either case. There will not be any performance differences.

The compiler will always add trailing padding at the end of a struct where needed; otherwise arrays of structs wouldn't be feasible, since the first member must always be aligned. Without trailing padding in some item struct_array[0], the first member of struct_array[1] would end up misaligned.
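
A quick way to check this on a given compiler is to look at the offsets and sizes directly. The sketch below uses only standard C11 facilities (offsetof, _Alignof, static_assert); the helper name print_layouts is made up for this example, and the printed numbers depend on the platform:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct node1 { int a; int b; char c; };
struct node2 { int a; char c; int b; };

/* Trailing padding rounds the size up to a multiple of the alignment,
   so every element of an array starts at a properly aligned address. */
static_assert(sizeof(struct node1) % _Alignof(struct node1) == 0,
              "node1 size is a multiple of its alignment");
static_assert(sizeof(struct node2) % _Alignof(struct node2) == 0,
              "node2 size is a multiple of its alignment");

/* Prints where each member sits and how large each struct is. */
void print_layouts(void) {
    printf("node1: a@%zu b@%zu c@%zu size=%zu\n",
           offsetof(struct node1, a), offsetof(struct node1, b),
           offsetof(struct node1, c), sizeof(struct node1));
    printf("node2: a@%zu c@%zu b@%zu size=%zu\n",
           offsetof(struct node2, a), offsetof(struct node2, c),
           offsetof(struct node2, b), sizeof(struct node2));
}
```

Under the question's assumption of a 4-byte int with 4-byte alignment, node1 carries its 3 padding bytes after c at the end, while node2 carries them between c and b; the total is the same.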


The order would matter if we were to do this though:

struct node3 {
      int  a;
      char b;
      int  c;
      char d;
};

Assuming 4 byte int and 4 byte alignment, b occupies 1 byte followed by 3 bytes of padding, and d another 1 byte followed by 3 bytes of padding, for 6 bytes of padding in total. The struct would be more compact if the two char members were placed adjacently, in which case the total amount of padding would be just 2 bytes.
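
As an illustration, the two layouts can be compared side by side. The name node3_reordered and the helper print_node3_sizes are hypothetical, and the sizes in the comments hold only under the 4-byte int assumption above:

```c
#include <stddef.h>
#include <stdio.h>

struct node3 {
    int  a;
    char b;   /* followed by 3 bytes of padding */
    int  c;
    char d;   /* followed by 3 bytes of trailing padding */
};

/* Same members, with the two chars placed next to each other. */
struct node3_reordered {
    int  a;
    char b;
    char d;   /* followed by 2 bytes of padding */
    int  c;
};

/* Prints both sizes; typically 16 and 12 with a 4-byte int. */
void print_node3_sizes(void) {
    printf("node3: %zu bytes, node3_reordered: %zu bytes\n",
           sizeof(struct node3), sizeof(struct node3_reordered));
}
```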

0
Vilx- On

OK, I might be completely off the mark here since this is a bit out of my league. If so, please correct me. But this is how I see it:

First of all, why do we need padding and alignment at all? It's just wasted bytes, isn't it? Well, it turns out that processors like it. That is, if you issue an instruction to the CPU that operates on a 32-bit integer, the CPU will typically want that integer to reside at a memory address that is divisible by 4. For a 64-bit integer it will need to reside at an address divisible by 8. And so on. This is done to make the CPU design simpler and faster.

If you violate this requirement (aka "unaligned memory access"), many CPUs will raise an exception. x86 is more forgiving because it will still perform the operation, but it can be slower: when the value straddles a boundary, the CPU has to fetch it from memory in two passes rather than one and then stitch the value together from these separate accesses.

So this is the reason why compilers add padding to structs: so that all the members are properly aligned and the CPU can access them quickly (or at all). Well, that's assuming the struct itself is located at a properly aligned memory address. But the compiler and the standard allocation functions will also take care of that, as long as you stick to standard operations for allocating the memory.

But it is possible to explicitly tell the compiler that you want a different alignment too. For example, if you want to use your struct to read in a bunch of data from a tightly packed file, you could explicitly set the packing to 1 byte (for example with #pragma pack(1), or __attribute__((packed)) in GCC). In that case the compiler may also have to emit extra instructions to compensate for potential misalignment on targets that cannot access misaligned data directly.
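
Here is a rough sketch of what that looks like with #pragma pack, which is a compiler extension (supported by GCC, Clang and MSVC) rather than standard C. The names node2_packed and read_record are made up for this example, and the record is assumed to be stored in the host's byte order:

```c
#include <string.h>

/* Pack members on 1-byte boundaries: no padding is inserted. */
#pragma pack(push, 1)
struct node2_packed {
    int  a;
    char c;
    int  b;   /* starts right after c, so it is usually misaligned */
};
#pragma pack(pop)

/* Copies one tightly packed record out of a raw byte buffer.
   memcpy places no alignment requirement on the source bytes. */
struct node2_packed read_record(const unsigned char *buf) {
    struct node2_packed n;
    memcpy(&n, buf, sizeof n);
    return n;
}
```

Accessing n.b through the struct is fine because the compiler knows the member may be misaligned. What you should avoid is taking the address of n.b and passing it around as a plain int pointer, since code using that pointer will assume normal alignment.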

TL;DR - wrong alignment makes everything slower (or under certain conditions can crash your program entirely).

However, this doesn't answer the question "where is it better to put the padding?" Padding is needed, yes, but where? Well, it doesn't make much difference directly, but by rearranging your members carefully you can reduce the size of the entire struct. And less memory used usually means a faster program. Especially if you create large arrays of these structs, using less memory will mean fewer memory accesses and more efficient use of the CPU cache.

In your example however I don't think there's any difference.

P.S. Why does your struct end with padding? Because arrays. The compiler wants to make sure that if you allocate an array of these structs, they will all be properly aligned, because array elements don't have any padding between them.