Is casting "pointer to array of type" to "pointer to type" defined?


As in this example:

#include <stdio.h>

void foo(int (*p)[5])
{
    int *i = (int *)p;
    printf("%d\n", *i);
}

Is it defined? IMO it is not.

Question inspired by this question


Please bear in mind that this question is tagged language-lawyer

1 answer

Eric Postpischil (best answer):

Nothing in the C 2018 standard defines what happens when the converted pointer is dereferenced, so the behavior is undefined (4 2: “Undefined behavior is otherwise indicated in this document by the words ‘undefined behavior’ or by the omission of any explicit definition of behavior.”).

A cast performs a conversion (6.5.4 5). Conversions are specified in 6.3. Pointer conversions are specified in 6.3.2.3. Paragraph 1 covers pointers to or from void *, which is not the case here. Paragraph 2 covers conversions between differently qualified versions of the same type, which is not the case here. Paragraph 3 covers conversions of null pointer constants, which is not the case here. Paragraph 4 covers conversions of null pointers, which is not the case here. Paragraph 5 covers conversions of integers to pointers, which is not the case here. Paragraph 6 covers conversions to integers, which is not the case here. Paragraph 8 covers conversions of pointers to function types, which is not the case here.
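For orientation, here is a rough sketch of the kinds of conversion those other paragraphs cover. The function names are made up for illustration, and the integer round trip assumes the implementation provides the optional uintptr_t type.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, used only for the function-pointer case. */
static int add_one(int x) { return x + 1; }

/* Returns the number of checks that held (5 if all of them did). */
int other_conversion_kinds(void)
{
    int value = 42;
    int passed = 0;

    /* Paragraph 1: to void * and back again compares equal. */
    void *v = &value;
    int *back = v;
    passed += (back == &value);

    /* Paragraph 2: adding a qualifier; the pointers compare equal. */
    const int *cp = &value;
    passed += (cp == &value);

    /* Paragraphs 3 and 4: a null pointer constant becomes a null pointer,
       and converting a null pointer yields a null pointer of the new type. */
    int *np = 0;
    double *dn = (double *)np;
    passed += (np == NULL && dn == NULL);

    /* Paragraphs 5 and 6: pointer to integer and back. The mapping is
       implementation-defined; uintptr_t (assumed available here) makes
       the round trip through void * compare equal. */
    uintptr_t bits = (uintptr_t)(void *)&value;
    passed += ((void *)bits == (void *)&value);

    /* Paragraph 8: to another function pointer type and back. */
    void (*generic)(void) = (void (*)(void))add_one;
    int (*restored)(int) = (int (*)(int))generic;
    passed += (restored == add_one && restored(1) == 2);

    return passed;
}
```

None of these is a conversion from int (*)[5] to int *, which is the point of the enumeration.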

That leaves only paragraph 7:

A pointer to an object type may be converted to a pointer to a different object type. If the resulting pointer is not correctly aligned for the referenced type, the behavior is undefined. Otherwise, when converted back again, the result shall compare equal to the original pointer. When a pointer to an object is converted to a pointer to a character type, the result points to the lowest addressed byte of the object. Successive increments of the result, up to the size of the object, yield pointers to the remaining bytes of the object.

This applies to our case, but it does not tell us what the value of the result is. It tells us when we convert the result back to the original type, we will have a value equivalent to the original pointer. But it is conceivable the converted value is in some encoded form that is not suitable for use for dereferencing.
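To make concrete what paragraph 7 does promise, here is a minimal sketch. The function names are mine, not anything from the standard. It checks the round trip and the character-type access, and it deliberately never dereferences the converted int *.

```c
#include <stddef.h>
#include <string.h>

/* Round trip: int (*)[5] -> int * -> int (*)[5].
   Returns 1 if the result compares equal to the original. */
int round_trip_compares_equal(void)
{
    int a[5] = {10, 20, 30, 40, 50};
    int (*p)[5] = &a;
    int *i = (int *)p;            /* conversion allowed by 6.3.2.3 7 */
    int (*q)[5] = (int (*)[5])i;  /* converted back again */
    return q == p;                /* specified to compare equal */
}

/* Character-type access: fills the array with a known byte pattern and
   counts how many bytes can be read back through an unsigned char *. */
size_t bytes_visible_through_char_pointer(void)
{
    int a[5];
    int (*p)[5] = &a;
    unsigned char *b = (unsigned char *)p;  /* lowest addressed byte */
    size_t matches = 0;

    memset(a, 0x5A, sizeof a);
    for (size_t k = 0; k < sizeof *p; k++) {
        matches += (b[k] == 0x5A);
    }
    return matches;  /* every byte of the object, i.e. sizeof(int[5]) */
}
```

Called from any test harness, the first function returns 1 and the second returns 5 * sizeof(int). Neither result says anything about what *i would yield.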

For example, imagine a version of C designed for debugging in which all pointers are instrumented with provenance information. A pointer to int [2] might be 16 bytes, four of which contain the address and twelve of which contain information that this originated as a pointer to int [2]. Converting it to int * may yield a pointer with exactly the same bytes, and converting it back would yield the same bytes. Since this is a debugging implementation of C, every time a pointer is dereferenced, it checks the pointer provenance as stored in those bytes. When we use *p, it sees that the bytes say the pointer originated as a pointer to int [2], and p is that type, so it allows the dereference and the program continues.

However, when we use *i, it sees the bytes say the pointer originated as a pointer to int [2], but i is int *, so it prints an error message and stops the program.

This hypothetical implementation conforms to the C standard:

  • The conversion is allowed, as 6.3.2.3 7 specifies.
  • Converting the pointer back yields a pointer equal to the original and that we can use as we used the original, as 6.3.2.3 7 specifies.
  • Using the pointer as an int * is not specified to work, and it does not work, so that conforms to the standard.

Note further that the standard does specify that a pointer converted to a pointer to a character type can be used to access the bytes representing an object, and our hypothetical C implementation can allow this. It simply approves any pointer to an object type for accessing character lvalues, even if the provenance says it originated as some other type.
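The hypothetical debugging implementation can be simulated in ordinary C to show that its rules are self-consistent. This is only a toy model: struct fat_ptr, enum origin, and all the function names are invented for illustration, and no real compiler is claimed to work this way.

```c
/* Hypothetical type tags recorded inside each instrumented pointer. */
enum origin { ORIGIN_INT, ORIGIN_INT_ARRAY_2 };

/* Simulated instrumented pointer: an address plus the type it originated as. */
struct fat_ptr {
    void *address;
    enum origin origin;
};

/* A conversion keeps every byte, including the recorded origin,
   so converting back yields the same bytes as the original. */
struct fat_ptr convert(struct fat_ptr p)
{
    return p;
}

/* Two simulated pointers compare equal when all their fields match. */
int fat_equal(struct fat_ptr x, struct fat_ptr y)
{
    return x.address == y.address && x.origin == y.origin;
}

/* Dereferencing through static type `as` is approved only when the
   recorded origin matches; otherwise the implementation would stop. */
int deref_allowed(struct fat_ptr p, enum origin as)
{
    return p.origin == as;
}

/* Character access is approved whatever the recorded origin says. */
int char_access_allowed(struct fat_ptr p)
{
    (void)p;
    return 1;
}
```

With a pointer p tagged ORIGIN_INT_ARRAY_2 and i = convert(p), the model approves deref_allowed(p, ORIGIN_INT_ARRAY_2), rejects deref_allowed(i, ORIGIN_INT), and still approves char_access_allowed(i), which matches the three bullet points above.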

One may believe the intention of the standard was to allow converted pointers to be used as pointers to their new type, but it does not explicitly say that. And therefore the behavior is technically not defined by the C standard.