is<thing> equivalents for char32_t


Are there any equivalents of the char functions (isspace, isalpha, etc.) defined in <cctype> for char32_t?

I had a look around and could only find iswspace (and related functions), which seem to be meant for 16-bit characters.

Note: while isspace takes an int as its parameter, it seems to produce erroneous results for Unicode characters.

Example:

char32_t dagger = U'\u2020';  // U+2020 DAGGER
if (isspace(dagger)) {
    puts("That is a space!");
}

This outputs "That is a space!"

There are 2 answers

0
5andr0 On BEST ANSWER

For characters that fit in a wchar_t, you can use the std::isalpha overload declared in <locale>, passing a suitable locale.
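As a rough sketch of the <locale> approach, something like the following could work. The helper names (locale_or_classic, is_alpha, is_space) are made up for illustration, and whether a named locale such as "en_US.UTF-8" exists depends on the platform, so the sketch falls back to the classic "C" locale when it does not.

```cpp
#include <locale>
#include <stdexcept>

// Hypothetical helper: returns the named locale if the system provides it,
// otherwise falls back to the classic "C" locale.
std::locale locale_or_classic(const char* name) {
    try {
        return std::locale(name);
    } catch (const std::runtime_error&) {
        // std::locale's constructor throws if the name is unknown.
        return std::locale::classic();
    }
}

// Hypothetical wrappers around the <locale> overloads, which take the
// character and the locale to classify it in.
bool is_alpha(wchar_t c, const std::locale& loc) {
    return std::isalpha(c, loc);
}

bool is_space(wchar_t c, const std::locale& loc) {
    return std::isspace(c, loc);
}
```

A call would look like is_space(L'\t', locale_or_classic("en_US.UTF-8")). With the classic locale, typically only ASCII characters are classified; results for non-ASCII characters depend on the locale that is actually in use.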

For anything above 0xFFFF you will need the ICU library:

u_isalpha or u_isUAlphabetic

u_isspace or u_isUWhiteSpace

Full list of functions: ICU's unicode/uchar.h

5
Nicol Bolas On

While C++-the-language has facilities for generating Unicode values, C++-the-library is almost completely deaf to Unicode. <ctype.h> and <cctype> have no idea how to handle Unicode values; their functionality is based on the C locale mechanism. Your implementation may provide locales that know what Unicode is, but the "C" locale that is the default is not one of them.
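One reason the <cctype> functions misbehave here: they require an argument that is representable as unsigned char (or equal to EOF), and passing anything else is undefined behavior. If only ASCII whitespace matters, a guard along these lines avoids handing out-of-range values to isspace. The name is_ascii_space is hypothetical, and this is a sketch, not a Unicode-aware replacement.

```cpp
#include <cctype>

// Hypothetical helper: classifies only ASCII code points and reports
// everything else as "not a space" instead of passing it to <cctype>.
bool is_ascii_space(char32_t c) {
    return c <= 0x7F && std::isspace(static_cast<unsigned char>(c)) != 0;
}
```

With this guard, is_ascii_space(U'\u2020') returns false. It also returns false for genuine Unicode spaces such as U+2003 EM SPACE, so it only fits cases where ASCII whitespace is all that needs detecting.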