Legal to initialize uint8_t array with string literal?

Is it OK to initialize a uint8_t array from a string literal? Does it work as expected or does it mangle some bytes due to signed-unsigned conversion? (I want it to just stuff the literal's bits in there unchanged.) GCC doesn't complain with -Wall and it seems to work.

const uint8_t hello[] = "Hello World";

I am using an API that takes a string as uint8_t *. Right now I am using a cast, otherwise I would get a warning:

const char* hello = "Hello World\n";
HAL_UART_Transmit(uart, (uint8_t *)hello, 12, 50);
// HAL_UART_Transmit(uart, hello, 12, 50);
// would give a warning such as:
// pointer targets in passing argument 2 of 'HAL_UART_Transmit' differ in signedness [-Wpointer-sign]

On this platform, char is 8 bits and signed. Is it under that circumstance OK to use uint8_t instead of char? Please don't focus on the constness issue, the API should take const uint8_t * but doesn't. This API call is just the example that brought me to this question.


Annoyingly, this question is now closed, and I would like to answer it myself. Apologies for adding this info here; I don't have permission to reopen it.

All of the following work with gcc -Wall -pedantic, but the fourth warns about converting signed to unsigned. The bit pattern in memory will be identical, and if you cast such an object to (uint8_t *) it will have the same behavior. According to the marked duplicate, this is because you may assign string literals to any char array.

const char string1[] = "Hello";
const uint8_t string2[] = "Hello";
uint8_t string3[] = "Hello";
uint8_t* string4 = "Hello";
char* string5 = "Hello";

Of course, only the first two are recommendable if the data is meant to be read-only: string3 is a modifiable copy of the literal, while string4 and string5 point directly at a string literal, which you shouldn't attempt to modify. In the concrete case above, you could either create a wrapper function/macro, or just leave the cast inside as a concession to the API and call it a day.
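A minimal sketch of the wrapper idea, so the cast lives in one place and the length is computed rather than hard-coded. The names demo_transmit and demo_transmit_str are hypothetical: demo_transmit is only a stand-in for an API that takes uint8_t *, not the real HAL function, and it simply records what it was given.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an API that takes uint8_t * instead of
   const char *. It only copies the bytes into a buffer. */
static uint8_t sent[64];
static size_t sent_len;

static int demo_transmit(uint8_t *data, uint16_t size)
{
    if (size > sizeof sent)
        return -1;
    memcpy(sent, data, size);
    sent_len = size;
    return 0;
}

/* Wrapper taking an ordinary C string. The cast (which also drops
   const, as a concession to the API) appears only here. */
static int demo_transmit_str(const char *s)
{
    size_t n = strlen(s);
    if (n > UINT16_MAX)
        return -1;
    return demo_transmit((uint8_t *)s, (uint16_t)n);
}
```

A call such as demo_transmit_str("Hello World\n") then replaces the explicit cast and the literal length 12 at every call site.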

1 answer

Eric Postpischil answered:

C 2018 6.7.9 14 tells us “An array of character type may be initialized by a character string literal or UTF–8 string literal…”

C 2018 6.2.5 15 tells us “The three types char, signed char, and unsigned char are collectively called the character types.”

C 2018 6.2.5 4 and 6.2.5 6 say there may be extended integer types.

There is no statement that any extended integer types are character types.

C 2018 7.20 4 tells us “For each type described herein that the implementation provides, <stdint.h> shall declare that typedef name…” and 7.20.1 1 tells us “When typedef names differing only in the absence or presence of the initial u are defined, they shall denote corresponding signed and unsigned types as described in 6.2.5…”

Therefore, a C implementation could provide an unsigned 8-bit type that is an extended integer type, not an unsigned char, and may define uint8_t to be this type, and then 6.7.9 14 does not tell us that an array of this type may be initialized by a character string literal.

If an implementation allows you to initialize an array of uint8_t with a string literal, then either it defines uint8_t to be unsigned char (or char, if char is unsigned on that implementation), or it defines uint8_t to be an extended integer type but allows you to initialize the array as an extension to the C standard. It would be up to the C implementation to define the behavior of that extension, but I would expect it to work just like initializing an array of character type.
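One way to find out which case applies on a given implementation is to ask the compiler. The following is a sketch assuming a C11 compiler (for _Generic and _Static_assert); the macro and function names are made up for illustration.

```c
#include <stdint.h>

/* 1 if uint8_t is one of the standard character types, 0 if it is
   some other (for example, extended) integer type. signed char is
   not listed because uint8_t is unsigned. */
#define UINT8_IS_CHARACTER_TYPE \
    _Generic((uint8_t)0, unsigned char: 1, char: 1, default: 0)

/* Optional guard: refuse to compile where initializing a uint8_t
   array from a string literal is not covered by 6.7.9 14. */
_Static_assert(UINT8_IS_CHARACTER_TYPE,
               "uint8_t is not a character type on this implementation");

static int uint8_is_character_type(void)
{
    return UINT8_IS_CHARACTER_TYPE;
}
```

On implementations where uint8_t is a typedef for unsigned char, the static assertion passes and the string-literal initialization is covered by the standard itself rather than by an extension.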

(Conceivably, defining uint8_t to be an extended integer type and disallowing its treatment as a character type could be useful for distinguishing the character types, which are allowed to alias any objects, from pure integer types, which would not allow such aliasing. This might allow the compiler to perform additional optimizations, since it would know the aliasing could not occur, or possibly to diagnose certain errors.)

The elements of a string literal have type char (by C 2018 6.4.5 6). C 2018 6.7.9 14 tells us that “Successive bytes of the string literal… initialize the elements of the array.” Each byte should initialize an array element in the usual way, including conversion to the destination type per C 2018 6.7.9 11. For the string you show, "Hello World", the character values are all non-negative, so there is no issue in converting their char values to uint8_t. If you had negative characters in the string, they should be converted to uint8_t in the usual way.
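The element-by-element conversion described above can be checked with a small sketch. It assumes the implementation accepts a string literal as an initializer for a uint8_t array (because uint8_t is a character type there, or as an extension); the helper names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* The "usual" conversion of a char value to uint8_t: the value is
   reduced modulo 256, so a negative char such as -1 becomes 255. */
static uint8_t char_to_u8(char c)
{
    return (uint8_t)c;
}

/* Returns 1 if every element of u equals the corresponding element
   of c after conversion to uint8_t, 0 otherwise. */
static int same_after_conversion(const uint8_t *u, const char *c, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (u[i] != char_to_u8(c[i]))
            return 0;
    }
    return 1;
}

static const char    as_char[] = "Hello World";
static const uint8_t as_u8[]   = "Hello World"; /* assumption: accepted by the implementation */
```

Calling same_after_conversion(as_u8, as_char, sizeof as_char) compares all twelve elements, including the terminating null.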

(If you have octal or hexadecimal escape sequences that have values not represented in a char, there could be some language-lawyer weirdness in the initialization.)