Is it legal to compare dangling pointers?

2.8k views Asked by At

Is it legal to compare dangling pointers?

int *p, *q;
{
    int a;
    p = &a;
}
{
    int b;
    q = &b;
}
std::cout << (p == q) << '\n';

Note how both p and q point to objects that have already vanished. Is this legal?

3

There are 3 answers

38
M.M On

Introduction: The first issue is whether it is legal to use the value of p at all.

After a has been destroyed, p acquires what is known as an invalid pointer value. Quote from N4430 (for discussion of N4430's status see the "Note" below):

When the end of the duration of a region of storage is reached, the values of all pointers representing the address of any part of the deallocated storage become invalid pointer values.

The behaviour when an invalid pointer value is used is also covered in the same section of N4430 (and almost identical text appears in C++14 [basic.stc.dynamic.deallocation]/4):

Indirection through an invalid pointer value and passing an invalid pointer value to a deallocation function have undefined behavior. Any other use of an invalid pointer value has implementation-defined behavior.

[ Footnote: Some implementations might define that copying an invalid pointer value causes a system-generated runtime fault. — end footnote ]

So you will need to consult your implementation's documentation to find out what should happen here (since C++14).

The term use in the above quotes means necessitating lvalue-to-rvalue conversion, as in C++14 [conv.lval/2]:

When an lvalue-to-rvalue conversion is applied to an expression e, and [...] the object to which the glvalue refers contains an invalid pointer value, the behaviour is implementation-defined.


History: In C++11 this said undefined rather than implementation-defined; it was changed by DR1438. See the edit history of this post for the full quotes.


Application to p == q: Supposing we have accepted in C++14+N4430 that the result of evaluating p and q is implementation-defined, and that the implementation does not define that a hardware trap occurs; [expr.eq]/2 says:

Two pointers compare equal if they are both null, both point to the same function, or both represent the same address (3.9.2), otherwise they compare unequal.

Since it's implementation-defined what values are obtained when p and q are evaluated, we can't say for sure what will happen here. But it must be either implementation-defined or unspecified.

g++ appears to exhibit unspecified behaviour in this case; depending on the -O switch I was able to have it say either 1 or 0, corresponding to whether or not the same memory address was re-used for b after a had been destroyed.


Note about N4430: This is a proposed defect resolution to C++14, that hasn't been accepted yet. It cleans up a lot of wording surrounding object lifetime, invalid pointers, subobjects, unions, and array bounds access.

In the C++14 text, it is defined under [basic.stc.dynamic.deallocation]/4 and subsequent paragraphs that an invalid pointer value arises when delete is used. However it's not clearly stated whether or not the same principle applies to static or automatic storage.

There is a definition "valid pointer" in [basic.compound]/3 but it is too vague to use sensibly.The [basic.life]/5 (footnote) refers to the same text to define the behaviour of pointers to objects of static storage duration, which suggests that it was meant to apply to all types of storage.

In N4430 the text is moved from that section up one level so that it does clearly apply to all storage durations. There is a note attached:

Drafting note: this should apply to all storage durations that can end, not just to dynamic storage duration. On an implementation supporting threads or segmented stacks, thread and automatic storage may behave in the same way that dynamic storage does.


My opinion: I don't see any consistent way to interpret the standard (pre-N4430) other than to say that p acquires an invalid pointer value. The behaviour doesn't seem to be covered by any other section besides what we have already looked at. So I am happy to treat the N4430 wording as representing the intent of the standard in this case.


0
supercat On

Historically, there have been some systems where using a pointer as an rvalue might cause the system to fetch some information identified by some bits in that pointer. For example, if a pointer could contain the address of an object's header along with an offset into the object, fetching a pointer could cause the system to also fetch some information from that header. If the object has ceased to exist, the attempt to fetch information from its header could fail with arbitrary consequences.

That having been said, in the vast majority of C implementations, all pointers that were alive at some particular moment in time will forever hold the same relationships with regard to the relational and subtraction operators as they had at that particular time. Indeed, in most implementations if one has char *p, one may determine whether it identifies part of an object identified by char *base; size_t size; by checking whether (size_t)(p-base) < size; such comparison will work even retrospectively if there is any overlap in the objects' lifetime.

Unfortunately, the Standard defines no means by which code can indicate that it requires any of the latter guarantees, nor is there a standard means by which code can ask whether a particular implementation can promise any of the latter behaviors and refuse compilation if it does not. Further, some hyper-modern implementations will regard any use of relational or subtraction operators on two pointers as a promise by the programmer that the pointers in question will always identify the same live object, and omit any code which would only be relevant if that assumption didn't hold. Consequently, even though many hardware platforms would be able to offer guarantees that would be useful to many algorithms, there's no safe way by which code can exploit any such guarantees even if code will never need to run on hardware which does not naturally provide them.

2
user2038893 On

The pointers contain the addresses of the variables they reference. The addresses are valid even when the variables that used to be stored there are released / destroyed / unavailable. As long as you don't try to use the values at those addresses you are safe, meaning *p and *q will be undefined.

Obviously the result is implementation defined, therefore this code example can be used to study the features of your compiler if one doesn't want to dig into to assembly code.

Whether this is a meaningful practice is totally different discussion.