Is it unspecified behavior to compare pointers to different arrays for equality?

The equality operators have the semantic restrictions of relational operators on pointers:

The == (equal to) and the != (not equal to) operators have the same semantic restrictions, conversions, and result type as the relational operators except for their lower precedence and truth-value result. [C++03 §5.10p2]

And the relational operators have a restriction on comparing pointers:

If two pointers p and q of the same type point to different objects that are not members of the same object or elements of the same array or to different functions, or if only one of them is null, the results of p<q, p>q, p<=q, and p>=q are unspecified. [§5.9p2]

Is this a semantic restriction which is "inherited" by equality operators?

Specifically, given:

int a[42];
int b[42];

It is clear that (a + 3) < (b + 3) is unspecified, but is (a + 3) == (b + 3) also unspecified?

There are 3 answers

Johannes Schaub - litb (best answer)

The wording for operator== and operator!= explicitly says that they share the semantics of the relational operators except for their truth-value result. So you need to look at what the standard defines for that truth-value result. If it says that the result is unspecified, then it is unspecified. If it defines specific rules, then it is not. In particular, it says:

Two pointers of the same type compare equal if and only if they are both null, both point to the same function, or both represent the same address. [C++03 §5.10p1]

So (a + 3) == (b + 3) is not unspecified: the two pointers represent different addresses, and the comparison yields false.
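To make the quoted rule concrete, here is a minimal sketch using the arrays from the question. The function name distinct_arrays_compare_unequal is made up for illustration, and the code assumes C++11 or later (for nullptr). It only compares pointers to existing elements, so every comparison in it has a specified result.

```cpp
#include <cassert>

// Hypothetical helper (not from the question): checks the three cases
// the quoted rule covers for object pointers.
bool distinct_arrays_compare_unequal() {
    int a[42] = {};
    int b[42] = {};

    // Different objects, hence different addresses: == yields false.
    bool different = (a + 3) != (b + 3);

    // Same element reached two ways: same address, so == yields true.
    bool same = (a + 3) == &a[3];

    // Two null pointers of the same type compare equal.
    int* n1 = nullptr;
    int* n2 = nullptr;
    bool nulls = (n1 == n2);

    return different && same && nulls;
}
```

Calling distinct_arrays_compare_unequal() should return true on a conforming implementation, since none of the comparisons falls under the unspecified relational case.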

Jerry Coffin

The equality operators (== and !=) produce specified results as long as the pointers are to objects of the same type. Given two pointers to the same type, exactly one of the following is true:

  1. both are null pointers, and they compare equal to each other.
  2. both are pointers to the same object, and they compare equal to each other.
  3. they are pointers to different objects, and they compare not-equal to each other.
  4. at least one is not initialized, and the result of the comparison is not defined (in fact, the comparison itself may never happen: just trying to read the pointer to do the comparison gives undefined behavior).

Under the same constraints (both pointers are to the same type of object), the result from the ordering operators (<, <=, >, >=) is only specified if both of them are pointers to the same object, or to separate objects in the same array. For this purpose, a "chunk" of memory allocated with malloc, new, etc., qualifies as an array. If the pointers refer to separate objects that are not part of the same array, the result is unspecified. If one or both of the pointers has not been initialized, you have undefined behavior.
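As a small sketch of the specified case, the following compares pointers into a single array only. The function name ordered_within_array is hypothetical. It also uses a one-past-the-end pointer, which may be formed and compared but not dereferenced.

```cpp
#include <cassert>

// Hypothetical helper: all comparisons stay inside one array,
// so the ordering operators give specified results.
bool ordered_within_array() {
    int a[42] = {};
    const int* lo = a + 3;
    const int* hi = a + 10;
    const int* end = a + 42;  // one past the end: comparable, never dereferenced

    // Pointers to higher-indexed elements compare greater.
    return lo < hi && hi < end && !(hi < lo) && lo <= lo;
}
```

Replacing hi with a pointer into a second, unrelated array would still compile, but the result of lo < hi would then be unspecified.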

Despite that, the comparison templates in the standard library (std::less, std::greater, std::less_equal and std::greater_equal) do all yield a meaningful result, even when the built-in operators do not. In particular, they are required to yield a total ordering. As such, you can get ordering if you want it, just not with the built-in comparison operators (though, of course, if either or both of the pointers is uninitialized, the behavior is still undefined).
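A minimal sketch of that workaround, assuming the two arrays from the question. The function names are hypothetical. Which of the two pointers orders first is not something the code relies on; it only checks that std::less gives a consistent answer and that pointer-keyed containers work as a result.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <set>

// Hypothetical helper: for two distinct pointers, a total order means
// exactly one of less(p, q) and less(q, p) is true.
bool less_gives_consistent_order() {
    int a[42] = {};
    int b[42] = {};
    int* p = a + 3;
    int* q = b + 3;

    std::less<int*> less;
    return less(p, q) != less(q, p) && !less(p, p);
}

// Hypothetical helper: std::set orders its keys with std::less<Key> by
// default, so pointers into unrelated arrays are usable as keys.
std::size_t count_distinct_keys() {
    int a[42] = {};
    int b[42] = {};

    std::set<int*> keys;
    keys.insert(a + 3);
    keys.insert(b + 3);
    keys.insert(a + 3);  // duplicate of the first key, not inserted again
    return keys.size();
}
```

Here less_gives_consistent_order() should return true and count_distinct_keys() should return 2, whereas writing p < q directly would leave the result unspecified.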

Yttrill

Since there's confusion on conformance semantics, these are the rules for C++. C uses a completely different conformance model.

  1. Undefined behaviour is an oxymoronic term: it means that the translator, NOT your program, may do as it pleases. This generally means it can generate code which will also do anything it pleases (but that is a deduction). Where the Standard says behaviour is undefined, the text is actually of no significance to the user, in the sense that eliding this text will not change the requirements the Standard imposes on translators.

  2. Ill-formed program means that, unless otherwise specified, the behaviour of the translator is rigidly defined: it is required to reject your program and issue a diagnostic message. The primary special case here is the One-Definition Rule: if you breach that, your program is ill-formed but no diagnostic is required.

  3. Implementation-defined imposes a requirement on the translator that it come with documentation specifying the behaviour explicitly. In this special case, undefined behaviour can be the result, but it must be explicitly stated.

  4. Unspecified is a stupid term which means that the behaviour comes from a set of permitted behaviours. In this sense, well-defined is just a special case where the set of permitted behaviours contains only one element. Unspecified does not require documentation, so in some sense it also means the same as implementation-defined without documentation.

In general, the C++ Standard is not a Language Standard; it is a model for a Language Standard. To generate an actual Standard you have to plug in various parameters. The easiest of these to recognize are the implementation-defined limits.

There are a couple of silly conflicts in the Standard. For example, a legitimate translator can reject every apparently good C++ program on the basis that you are required to supply a main() function but the translator only supports identifiers of 1 character. This problem is resolved by the notion of QOI, or Quality of Implementation. It basically says: who cares, no one is going to buy that compiler just because it is conforming.

Technically, the unspecified nature of operator < when the pointers are to unrelated objects is probably intended to mean that you will get some kind of result, which is either true or false, but your program will not crash. However, this is not the correct meaning of unspecified, so that is a Defect: unspecified imposes a burden on the Standard's writers to document the set of allowed behaviours, because if the set is open, then it is equivalent to undefined behaviour.

I actually proposed std::less as a solution to the problem that some data structures require keys to be totally ordered, but pointers are not totally ordered by operator <. On most machines using linear addressing, std::less is the same as <, but the std::less operation on, say, an x86 processor is potentially more expensive.