Linux setxattr: possible to use Unicode string?


I wrote the following code in VS Code and ran it to set an extended attribute on a file. It seemed to run successfully, but when I checked the value, the text was not correct. Are Unicode strings supported in file extended attributes? If so, how can I fix the code below?

#include <stdio.h>
#include <sys/xattr.h>

int main()
{
    printf("ねこ\n");
    /* setxattr() returns int: 0 on success, -1 on error */
    int res = setxattr("/mnt/cat/test.txt", "user.dog",
                       "ねこ", 2, 0); /* also tested 4 and 8 */
    printf("Result = %d\n", res);
    return 0;    
}

Programme output

ねこ
Result = 0

Reading attribute

$ getfattr test.txt  -d
# file: test.txt
user.dog=0s44E=

1 answer

phuclv (best answer)

Obviously ねこ can't be stored in 2 bytes. The characters are U+306D and U+3053, encoded in UTF-8 as E3 81 AD E3 81 93, so the length must be set to 6. With a length of 2, only the bytes E3 81 are stored, which is the 0s44E= you saw. If you set the length to 6, you'll see that getfattr test.txt -d outputs

user.dog=0s44Gt44GT

That's because -d doesn't know what format the data is in and just dumps it as binary. The 0s prefix means that the data is base64-encoded, as stated in the man page:

  • -d, --dump

    • Dump the values of all matched extended attributes.
  • -e en, --encoding=en

    • Encode values after retrieving them. Valid values of en are "text", "hex", and "base64". Values encoded as text strings are enclosed in double quotes ("), while strings encoded as hexidecimal and base64 are prefixed with 0x and 0s, respectively.
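To see where the 0s values come from, here is a minimal sketch of base64 encoding in C. The helper name base64_encode is made up for illustration and is not part of getfattr or any library; it only shows the standard RFC 4648 alphabet applied to raw bytes.

```c
#include <stddef.h>

/* Standard base64 alphabet (RFC 4648). */
static const char B64[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Hypothetical helper: encode len bytes from src into dst as base64.
 * dst must have room for 4 * ((len + 2) / 3) + 1 characters. */
static void base64_encode(const unsigned char *src, size_t len, char *dst)
{
    size_t i = 0, o = 0;

    /* Full 3-byte groups become 4 output characters. */
    while (i + 3 <= len) {
        unsigned v = ((unsigned)src[i] << 16) |
                     ((unsigned)src[i + 1] << 8) |
                      (unsigned)src[i + 2];
        dst[o++] = B64[(v >> 18) & 63];
        dst[o++] = B64[(v >> 12) & 63];
        dst[o++] = B64[(v >> 6) & 63];
        dst[o++] = B64[v & 63];
        i += 3;
    }

    /* A trailing group of 1 or 2 bytes is padded with '='. */
    if (len - i == 1) {
        unsigned v = (unsigned)src[i] << 16;
        dst[o++] = B64[(v >> 18) & 63];
        dst[o++] = B64[(v >> 12) & 63];
        dst[o++] = '=';
        dst[o++] = '=';
    } else if (len - i == 2) {
        unsigned v = ((unsigned)src[i] << 16) | ((unsigned)src[i + 1] << 8);
        dst[o++] = B64[(v >> 18) & 63];
        dst[o++] = B64[(v >> 12) & 63];
        dst[o++] = B64[(v >> 6) & 63];
        dst[o++] = '=';
    }
    dst[o] = '\0';
}
```

Encoding only the two bytes E3 81 gives 44E=, which is the truncated value from the question, while encoding all six bytes E3 81 AD E3 81 93 gives 44Gt44GT.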

Just plug 44Gt44GT into any base64 decoder, or run echo 44Gt44GT | base64 --decode, and you'll see the correct string printed out. To see the string directly from getfattr, you need to specify the format with -e text:

$ getfattr -n user.dog -e text test.txt
# file: test.txt
user.dog="ねこ"
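As for fixing the code in the question, a hedged sketch: instead of hard-coding the size, let strlen() count the bytes of the UTF-8 string. The helper names set_text_xattr and get_text_xattr are hypothetical, and this assumes the source file is saved as UTF-8 and the filesystem supports user.* attributes.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/xattr.h>

/* Hypothetical helper: store a NUL-terminated UTF-8 string as an xattr value.
 * The size passed to setxattr() is the byte count from strlen(),
 * not the number of characters (6 for "ねこ", not 2). */
static int set_text_xattr(const char *path, const char *name,
                          const char *value)
{
    return setxattr(path, name, value, strlen(value), 0);
}

/* Hypothetical helper: read the value back into buf and NUL-terminate it.
 * Returns the value length in bytes, or -1 on error (errno is set by
 * getxattr(); a zero-sized buffer is rejected without calling it). */
static ssize_t get_text_xattr(const char *path, const char *name,
                              char *buf, size_t bufsize)
{
    if (bufsize == 0)
        return -1;

    /* Leave one byte for the terminator; xattr values are not NUL-terminated. */
    ssize_t n = getxattr(path, name, buf, bufsize - 1);
    if (n >= 0)
        buf[n] = '\0';
    return n;
}
```

Calling set_text_xattr("/mnt/cat/test.txt", "user.dog", "ねこ") from main should then store all 6 bytes, and get_text_xattr with a large enough buffer should return 6 and give back the same string. Check the return value against -1 and print errno on failure, since setxattr can fail, for example when the filesystem does not support user attributes.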