Pointer comparison in C


If I allocated something like

 size_t n = ???;
 unsigned char* s = malloc(n);

will it be perfectly defined behaviour to compare pointers to locations s + i for 0 <= i < n, in the sense that s + i < s + j if and only if i < j? Probably it is, but one reads that pointer comparison is defined only within a contiguous array, and to a beginner it is not clear whether an allocated block like the one above counts as an "array", since that term also has a formal meaning in C. One also reads about virtual memory without yet fully digesting it, and starts to worry... So I thought I'd ask to make sure.
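To make the property concrete, it can be written as a check over every pair of offsets. This is only a minimal sketch: the helper name ordering_matches_indices is made up for illustration, and the caller is assumed to pass a pointer to the start of a single allocated block of at least n bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if (s + i < s + j) agrees with (i < j)
   for all 0 <= i, j < n, and 0 if some pair disagrees.
   Assumption: s points to the start of one allocation of at least n bytes. */
int ordering_matches_indices(const unsigned char *s, size_t n)
{
    assert(s != NULL);
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            if ((s + i < s + j) != (i < j))
                return 0;
        }
    }
    return 1;
}
```

The question is then whether the standard guarantees that this function returns 1 when s comes from malloc(n).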


There are 2 answers

Lundin

The C standard has a problem here, because what malloc returns is just a chunk of memory with no declared type. In theory it is not an array or any other type (yet). In practice the compiler must treat it like an array, or the whole C language falls apart. Nobody thought of this when the C99 standard was designed, and the ISO C working groups have since shied away from fixing the problem.

To tell whether something is a valid access, we need to know its type. If there is no declared type, as in the malloc case, we at least need to know the effective type, a concept the C standard introduced in C99 to address such scenarios.

Formally, s points at a memory location that has no declared type. C17 6.5 §6 then says:

For all other accesses to an object having no declared type, the effective type of the object is simply the type of the lvalue used for the access.

But you don't access it, so it has no effective type either. And since it has neither a declared nor an effective type, it cannot be an array of any type. Pointer arithmetic on s, which is not an array, is therefore undefined behavior per C17 6.5.6 §8, unless you access the memory first and thereby give it some effective type.

Obviously we can't read the C standard literally here - it is broken. Specifically, it has the following defects:

  • 6.5 §6 does not address how to treat "aggregate types" (arrays, structs) or type qualifiers (const, volatile) with regard to effective type.
  • 6.5.6 §8 does not support pointer arithmetic on items with no effective type.

In order to produce some sort of meaningful executable, compilers therefore ignore all of this and treat whatever is returned by malloc as an array. Similarly, compilers tend to support pointer arithmetic on areas with no declared type just fine, or hardware-related programming in C would also be impossible.

So to summarize:

  • Will it be perfectly defined behavior to do pointer arithmetic within this allocated chunk?
    No.

  • Can you compare two pointers pointing into the chunk?
    Yes, you can always compare two pointers in C no matter where they point. But how will you get the pointers pointing into this chunk with no type, without using pointer arithmetic?

  • Will it work just fine on every half-decent compiler ever released?
    Yes.

Vlad from Moscow

From the C Standard (3. Terms, definitions, and symbols)

3.15

1 object region of data storage in the execution environment, the contents of which can represent values

2 NOTE When referenced, an object may be interpreted as having a particular type; see 6.3.2.1.

and (7.22.3 Memory management functions)

1 The order and contiguity of storage allocated by successive calls to the aligned_alloc, calloc, malloc, and realloc functions is unspecified. The pointer returned if the allocation succeeds is suitably aligned so that it may be assigned to a pointer to any type of object with a fundamental alignment requirement and then used to access such an object or an array of such objects in the space allocated (until the space is explicitly deallocated).
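As an illustration of the last sentence of that paragraph, here is a minimal sketch that assigns the malloc result to an int pointer and then uses the space as an array of int. The helper name sum_of_indices is hypothetical and not taken from the standard.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: stores 0, 1, ..., n-1 in a malloc'ed block used as
   an array of n ints and returns their sum, or -1 if the allocation fails.
   Intended for small n > 0, so the int and long values cannot overflow. */
long sum_of_indices(size_t n)
{
    assert(n <= SIZE_MAX / sizeof(int));   /* the size computation must not wrap */
    int *a = malloc(n * sizeof *a);
    if (a == NULL)
        return -1;
    long sum = 0;
    for (size_t i = 0; i < n; i++) {
        a[i] = (int)i;                     /* access through an int lvalue */
        sum += a[i];
    }
    free(a);
    return sum;
}
```

For example, sum_of_indices(5) adds 0 + 1 + 2 + 3 + 4 and returns 10 when the allocation succeeds.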

also this quote will be useful here (6.2.5 Types)

20 Any number of derived types can be constructed from the object and function types, as follows:

— An array type describes a contiguously allocated nonempty set of objects with a particular member object type, called the element type. The element type shall be complete whenever the array type is specified. Array types are characterized by their element type and by the number of elements in the array. An array type is said to be derived from its element type, and if its element type is T, the array type is sometimes called ‘‘array of T’’. The construction of an array type from an element type is called ‘‘array type derivation’’.

and finally (6.5.6 Additive operators)

8 When an expression that has integer type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the pointer operand points to an element of an array object, and the array is large enough, the result points to an element offset from the original element such that the difference of the subscripts of the resulting and original array elements equals the integer expression. In other words, if the expression P points to the i-th element of an array object, the expressions (P)+N (equivalently, N+(P)) and (P)-N (where N has the value n) point to, respectively, the i+n-th and i−n-th elements of the array object, provided they exist. Moreover, if the expression P points to the last element of an array object, the expression (P)+1 points one past the last element of the array object, and if the expression Q points one past the last element of an array object, the expression (Q)-1 points to the last element of the array object. If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined. If the result points one past the last element of the array object, it shall not be used as the operand of a unary * operator that is evaluated.
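The rules in this paragraph can be sketched on an allocated block as follows, on the assumption (which is exactly what the question asks about) that the n bytes returned by malloc may be treated as an array of n unsigned char. The helper name count_by_pointer_walk is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: walks a malloc'ed block of n bytes with a pointer and
   returns the number of bytes visited, or (size_t)-1 if the allocation fails.
   Assumption: the block is treated as an array of n unsigned char. */
size_t count_by_pointer_walk(size_t n)
{
    unsigned char *s = malloc(n);
    if (s == NULL)
        return (size_t)-1;

    unsigned char *end = s + n;        /* one past the last element: formed and compared, never dereferenced */
    assert((size_t)(end - s) == n);    /* difference of the two subscripts */

    size_t visited = 0;
    for (unsigned char *p = s; p < end; p++) {
        *p = 0;                        /* p is strictly below end here, so the store is in range */
        visited++;
    }
    free(s);
    return visited;
}
```

With a successful allocation, count_by_pointer_walk(10) visits offsets 0 through 9 and returns 10. The loop condition p < end is the same kind of comparison the question asks about.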