fread(): Reading from a file (without alignment) results in skipping of bytes

I have a file and, using C, I want to read its contents with fread() (from stdio.h) into the members of a struct. (In my case there is a 2-byte integer at the start, followed by a 4-byte integer.) But after correctly writing the file's contents into the first, two-byte member of the struct, it skips two bytes before continuing with the second, four-byte member.

To demonstrate, I have created a 16 byte file to read from. In Hex it looks like this (Little-endian): 22 11 66 55 44 33 11 11 00 00 00 00 00 00 00 00

With the following code I expect the first variable, twobytes, to be 0x1122 and the second, fourbytes, to be 0x33445566. But instead it prints:

twobytes: 0x1122 
fourbytes: 0x11113344

sizeof(FOO) = 8
&foo     : 0061FF14
&foo.two : 0061FF14
&foo.four: 0061FF18

So bytes 3 and 4 (0x66 and 0x55) are skipped. The code:

#include <stdio.h>
#include <stdint.h>

int main(void) {

    FILE* file = fopen("216543110.txt", "r");
    if (file==NULL) { return 1; }

    typedef struct
    {
        uint16_t twobytes;
        uint32_t fourbytes;
    }__attribute__((__packed__)) // removing this attribute or just the underscores around packed does not change the outcome
    FOO;
    
    FOO foo;
    
    fread(&foo, sizeof(FOO), 1, file);
    
    printf("twobytes: 0x%x \n", foo.twobytes);
    printf("fourbytes: 0x%x \n\n", foo.fourbytes);

    printf("sizeof(FOO) = %d\n", sizeof(FOO));
    printf("&foo     : %p\n", &foo);
    printf("&foo.two : %p\n", &foo.twobytes);
    printf("&foo.four: %p\n", &foo.fourbytes);
    
    fclose(file);
    return 0;
}

Using a struct with two same size integers works as expected.


So: Using fread() to write into different size variables causes skipping bytes:

22 11 .. .. 44 33 11 11 ...

instead of

22 11 66 55 44 33 ...


I am aware that something about byte alignment is playing a role here, but how does that affect the reading of bytes? If C wants to add padding to the structs, how does that affect the reading from a file? I don't care if C is storing the struct members as 22 11 .. .. 66 55 44 33 ... or 22 11 66 55 44 33 ..., I'm confused about why it fails to read my file correctly.

Also, I am using gcc version 6.3.0 (MinGW.org GCC-6.3.0-1)

There are 3 answers

Andreas Wenzel On BEST ANSWER

On GCC, when targeting x86 platforms, the

__attribute__((__packed__))

only works on structs with

__attribute__((gcc_struct)).

However, when targeting Microsoft Windows platforms, the default attribute for structs is

__attribute__((ms_struct)).

Therefore, I see three ways to accomplish what you want:

  1. Use the compiler command-line option -mno-ms-bitfields to make all structs default to __attribute__((gcc_struct)).
  2. Explicitly use __attribute__((gcc_struct)) on your struct.
  3. Use #pragma pack instead of __attribute__((__packed__)).
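As a rough illustration of option 3, here is a minimal sketch that packs the struct with #pragma pack and checks the layout at compile time. The names PackedFoo and packed_from_bytes are made up for this example, and the expected values assume a little-endian host such as x86. The option 2 spelling is shown only in a comment, since gcc_struct is specific to x86 GCC targets.

```c
#include <stddef.h>   /* offsetof */
#include <stdint.h>
#include <string.h>   /* memcpy */

/* Option 2 would instead be spelled roughly like this (x86 GCC only):
 *   typedef struct __attribute__((gcc_struct, __packed__)) { ... } FOO;
 */

/* Option 3: pack the struct to 1-byte alignment with #pragma pack. */
#pragma pack(push, 1)
typedef struct {
    uint16_t twobytes;
    uint32_t fourbytes;
} PackedFoo;              /* hypothetical name, not from the question */
#pragma pack(pop)

/* Fail at compile time if the struct does not match the 6-byte file layout. */
_Static_assert(sizeof(PackedFoo) == 6, "PackedFoo must be 6 bytes");
_Static_assert(offsetof(PackedFoo, fourbytes) == 2, "no padding expected");

/* Copy the first sizeof(PackedFoo) bytes of a file image into the struct.
 * This mimics what fread(&foo, sizeof foo, 1, file) does with file data. */
PackedFoo packed_from_bytes(const unsigned char *bytes) {
    PackedFoo foo;
    memcpy(&foo, bytes, sizeof foo);
    return foo;
}
```

With the 16-byte file from the question, the packed struct should yield twobytes = 0x1122 and fourbytes = 0x33445566 on a little-endian machine.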

Also, as pointed out in the answer by @chqrlie, there are a few other problems in your code. In particular, when reading binary data, you should normally open the file in binary mode rather than text mode, unless you know what you are doing (which you possibly do, since the file has a .txt extension).

chqrlie On

From the output your program produces, it seems the compiler ignores the __attribute__((__packed__)) specification.

The gcc online user's guide documents the __attribute__ ((__packed__)) type attribute with an example where this attribute is placed before the { of the definition.

This extension is nonstandard, so different compilers, or different versions of a given compiler, may handle it differently depending on where the attribute is placed. If you use gcc, moving the attribute should fix the problem. If you use a different compiler, look at its documentation to figure out what it does differently.

Also note these remarks:

  • the file should be opened in binary mode, with "rb".
  • the sizeof(FOO) argument should be cast to (int) for the %d conversion specifier.
  • pointer arguments for %p should be cast to (void *).
  • foo.twobytes has the same address as foo, as mandated by the C Standard, and &foo.fourbytes is located 4 bytes further on. This means foo.fourbytes is aligned and there are 2 padding bytes between the 2 members.
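To make the last remark concrete, here is a small sketch of why the padding produces the output seen in the question. fread() copies consecutive file bytes into consecutive memory bytes, so file bytes 3 and 4 land in the padding and fourbytes receives bytes 5 to 8. The names PaddedFoo, padding_between_members and padded_from_bytes are invented for this example, and the values assume a little-endian host with 4-byte alignment for uint32_t.

```c
#include <stddef.h>   /* offsetof */
#include <stdint.h>
#include <string.h>   /* memcpy */

/* Same members as in the question, but with no packing applied. */
typedef struct {
    uint16_t twobytes;
    uint32_t fourbytes;
} PaddedFoo;              /* hypothetical name, not from the question */

/* Number of padding bytes the compiler inserted between the two members. */
size_t padding_between_members(void) {
    return offsetof(PaddedFoo, fourbytes) - sizeof(uint16_t);
}

/* Mimic fread(&foo, sizeof foo, 1, file): copy sizeof(PaddedFoo)
 * consecutive bytes, padding included. The caller must supply at
 * least sizeof(PaddedFoo) bytes. */
PaddedFoo padded_from_bytes(const unsigned char *bytes) {
    PaddedFoo foo;
    memcpy(&foo, bytes, sizeof foo);
    return foo;
}
```

On a typical x86 build, padding_between_members() gives 2, and the question's file bytes give fourbytes = 0x11113344, matching the output the asker observed.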

Try modifying your code this way:

#include <stdio.h>
#include <stdint.h>

int main(void) {
    FILE *file = fopen("216543110.txt", "rb");
    if (file == NULL) {
        return 1;
    }

    typedef struct __attribute__((__packed__)) {
        uint16_t twobytes;
        uint32_t fourbytes;
    } FOO;
    
    FOO foo;
    
    if (fread(&foo, sizeof(FOO), 1, file) == 1) {
        printf("twobytes : 0x%x\n", foo.twobytes);
        printf("fourbytes: 0x%x\n\n", foo.fourbytes);

        printf("sizeof(FOO) = %d\n", (int)sizeof(FOO));
        printf("&foo     : %p\n", (void *)&foo);
        printf("&foo.two : %p\n", (void *)&foo.twobytes);
        printf("&foo.four: %p\n", (void *)&foo.fourbytes);
    }
    fclose(file);
    return 0;
}

etsuhisa On

Since the data structure in memory differs from the one in the file, it may be better to read the members of the struct one by one. For example, you can use "offsetof" to specify where in the struct each member should be read to. The following code reads the members of the struct with the fread_members function.

#include <stdio.h>
#include <stdint.h>
#include <stddef.h> /* offsetof */

/* offset and size of each member */
typedef struct {
    size_t offset;
    size_t size;
} MEMBER;

#define MEMBER_ELM(type, member) {offsetof(type, member), sizeof(((type*)NULL)->member)}

size_t fread_members(void *ptr, MEMBER *members, FILE *stream) {
    char *top = (char *)ptr;
    size_t rs = 0;
    int i;
    for(i = 0; members[i].size > 0; i++){
        rs += fread(top + members[i].offset, 1, members[i].size, stream);
    }
    return rs;
}

int main(void) {

    FILE* file = fopen("216543110.txt", "rb"); /* binary mode, since the file holds raw bytes */
    if (file==NULL) { return 1; }

    typedef struct
    {
        uint16_t twobytes;
        uint32_t fourbytes;
    } FOO;

    MEMBER members[] = {
        MEMBER_ELM(FOO, twobytes),
        MEMBER_ELM(FOO, fourbytes),
        {0, 0} /* terminated */
    };

    FOO foo;

    fread_members(&foo, members, file);

    :
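The same member-by-member idea can also be written without depending on the struct's in-memory layout or on the host's byte order. This is a sketch only, assuming the file stores both fields in little-endian order as in the question. Foo, FOO_FILE_SIZE, foo_decode_le and foo_read_le are hypothetical names introduced for this example.

```c
#include <stdint.h>
#include <stdio.h>

typedef struct {
    uint16_t twobytes;
    uint32_t fourbytes;
} Foo;                    /* in-memory layout may contain padding */

enum { FOO_FILE_SIZE = 6 };   /* 2 + 4 bytes as stored in the file */

/* Assemble both members from little-endian bytes using shifts,
 * so the result does not depend on padding or host endianness. */
void foo_decode_le(const unsigned char buf[FOO_FILE_SIZE], Foo *out) {
    out->twobytes  = (uint16_t)(buf[0] | (buf[1] << 8));
    out->fourbytes = (uint32_t)buf[2]
                   | ((uint32_t)buf[3] << 8)
                   | ((uint32_t)buf[4] << 16)
                   | ((uint32_t)buf[5] << 24);
}

/* Read one record from the stream. Returns 1 on success, 0 on a short read. */
int foo_read_le(FILE *stream, Foo *out) {
    unsigned char buf[FOO_FILE_SIZE];
    if (fread(buf, 1, sizeof buf, stream) != sizeof buf) {
        return 0;
    }
    foo_decode_le(buf, out);
    return 1;
}
```

A caller would open the file with "rb" and call foo_read_le(file, &foo), checking the return value before using foo.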