readdir converts "…" in a filename to "à" in C

I am trying to list all files in the current directory and its subdirectories so that I can compute the size of a directory by summing the sizes of the files in it. But d_name returns a different file name, which I can't pass to fopen, because the … in the filename is changed to à. I don't know which other characters are changed. I believe getting d_name in UTF-8 could fix this.

The relevant part of the code currently looks like this:

DIR *dr;
struct dirent *d;

dr = opendir(foldername);

if (dr != NULL) {
    for (d = readdir(dr); d != NULL; d = readdir(dr)) {
        printf("d name: %s \n", d->d_name);
    }
}

The actual filename is

Write For Us. Quick guide to explain how to send… _ by Domenico Nicoli _ Dev Genius.html

But d_name prints it as:

Write For Us. Quick guide to explain how to sendà _ by Domenico Nicoli _ Dev Genius.html

How can I fix this? I want to open the file, but fopen always returns a null pointer for it. Other files open fine.

There is 1 answer

Answered by chqrlie

This seems to be a problem related to the difficult mapping between the filename encoding used by Windows and the UTF-8 encoding used by the Unix emulation for opendir and readdir.

Can you try this debugging code so the encoding returned by readdir can be further analysed:

    DIR *dr;
    struct dirent *d;

    dr = opendir(foldername);
    if (dr != NULL) {
        while ((d = readdir(dr)) != NULL) {
            char *p;
            printf("d name: ");
            /* print ASCII as is, every other byte as a \xNN escape */
            for (p = d->d_name; *p; p++) {
                unsigned char b = *p;
                if (b >= ' ' && b < 0x7F)
                    printf("%c", b);
                else
                    printf("\\x%02x", b);
            }
            printf("\n");
        }
        closedir(dr);
    }
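
To make that output easier to interpret, here is a minimal sketch of the same escaping logic as a standalone function. The name `escape_name` and its buffer-based interface are hypothetical and not part of the code above. If readdir returns UTF-8, the "…" (U+2026) should show up as `\xe2\x80\xa6`. If it returns the Windows-1252 encoding, it should show up as the single byte `\x85`. Byte 0x85 is "à" in the OEM code pages 437 and 850, so a console using one of those would display it the way the question describes. That explanation is an assumption until the actual bytes are known.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: copies `name` into `out`, keeping printable ASCII
   and escaping every other byte as \xNN. The result is truncated if `out`
   is too small. Returns the number of characters written, excluding the
   terminating null byte. */
size_t escape_name(const char *name, char *out, size_t outsize)
{
    size_t n = 0;
    const unsigned char *p;

    if (outsize == 0)
        return 0;
    for (p = (const unsigned char *)name; *p; p++) {
        if (*p >= ' ' && *p < 0x7F) {
            if (n + 1 >= outsize)       /* need room for char + '\0' */
                break;
            out[n++] = (char)*p;
        } else {
            if (n + 4 >= outsize)       /* need room for \xNN + '\0' */
                break;
            n += (size_t)snprintf(out + n, outsize - n, "\\x%02x",
                                  (unsigned)*p);
        }
    }
    out[n] = '\0';
    return n;
}
```

Calling `escape_name(d->d_name, buf, sizeof buf)` inside the readdir loop and printing `buf` gives the same text as the debugging loop, in a form that can also be logged or compared.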