Does comparing a pointer that has been freed invoke UB?

This seems to be a fairly common pattern, e.g. in hexchat. (The code below may not compile; see also the plugin docs. Also note that hexchat_plugin_get_info hasn't been used in forever, so I'm omitting it for simplicity.)

#include "hexchat-plugin.h"

static hexchat_plugin *ph;
static int timer_cb(void *userdata) {
    if (hexchat_set_context(ph, userdata)) { /* <-- is this line UB? */
        /* omitted */
    }
    return 0;
}
static int do_ub(char *word[], char *word_eol[], void *userdata) {
    void *context = hexchat_get_context(ph);
    hexchat_hook_timer(ph, 1000, timer_cb, context);
    /* Free the context. In practice this would be done by another plugin
     * or by the user, not like this; for the purposes of this example it
     * simulates the user closing the context. */
    hexchat_command(ph, "close");
    return HEXCHAT_EAT_ALL;
}
int hexchat_plugin_init(hexchat_plugin *plugin_handle, char **plugin_name, char **plugin_desc, char **plugin_version, char *arg) {
    *plugin_name = "do_ub";
    *plugin_desc = "does ub when you /do_ub";
    *plugin_version = "1.0.0";
    ph = plugin_handle;
    /* etc */
    hexchat_hook_command(ph, "do_ub", 0, do_ub, "does UB", NULL);
    return 1;
}

The line in timer_cb causes hexchat to compare the pointer (potentially freed; definitely freed in this example, see the comment in do_ub) with other pointers. If you follow the call from plugin.c#L1089 (hexchat_set_context), you end up at hexchat.c#L191 (is_session). To invoke this code, run /do_ub in hexchat.

Relevant code:

int
hexchat_set_context (hexchat_plugin *ph, hexchat_context *context)
{
    if (is_session (context))
    {
        ph->context = context;
        return 1;
    }
    return 0;
}

int
is_session (session * sess)
{
    return g_slist_find (sess_list, sess) ? 1 : 0;
}
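
To make the comparison visible: g_slist_find walks the list and compares each node's data pointer with the pointer it was given, so is_session reads the value of sess even though it never dereferences it. Below is a minimal self-contained model of that walk. It is not GLib's source; `node` and `list_find` are hypothetical stand-ins for GSList and g_slist_find.

```c
#include <stddef.h>

/* Hypothetical stand-in for GSList: a singly linked list of opaque pointers. */
struct node {
    void *data;
    struct node *next;
};

/* Hypothetical stand-in for g_slist_find: returns the first node whose
 * data pointer equals `data`, or NULL if there is none. The comparison
 * reads the value of `data`; it never dereferences it. */
static struct node *list_find(struct node *list, const void *data)
{
    while (list != NULL && list->data != data)
        list = list->next;
    return list;
}
```

With only live objects in play this is entirely well defined. The question is what happens when `data` is a pointer whose target has already been freed.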

Is this sort of thing UB?

There are 3 answers

7
Eugene Sh. (best answer)

The value of a pointer becomes indeterminate once the object it points to has reached the end of its lifetime, as stated in the C11 Standard draft, 6.2.4p2 (Storage durations of objects) (the emphasis is mine):

The lifetime of an object is the portion of program execution during which storage is guaranteed to be reserved for it. An object exists, has a constant address, and retains its last-stored value throughout its lifetime. If an object is referred to outside of its lifetime, the behavior is undefined. The value of a pointer becomes indeterminate when the object it points to (or just past) reaches the end of its lifetime.

And using its value (for anything at all) is explicitly undefined behavior, as stated in Annex J.2 (Undefined behavior):

The behavior is undefined in the following circumstances: [...] The value of a pointer to an object whose lifetime has ended is used (6.2.4).
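
A minimal sketch of what that rule does and does not forbid. `release` is a hypothetical helper, not something from hexchat or the Standard. Storing a new value into a pointer variable after free is fine, because a store does not read the stale value.

```c
#include <stdlib.h>

/* Hypothetical helper: frees the object that *pp points to and then
 * overwrites *pp, so this one pointer variable never holds a stale
 * value afterwards. Copies of the pointer stored elsewhere are NOT
 * fixed by this and still become indeterminate. */
static void release(int **pp)
{
    free(*pp);   /* *pp is read here, while the object is still alive */
    *pp = NULL;  /* a plain store: the stale value is never read */
}
```

After `release(&p)`, a test such as `p == NULL` is well defined because `p` holds a fresh value. A copy made earlier, like the userdata handed to hexchat_hook_timer in the question, still holds the old value, and comparing that copy is the use that J.2 lists.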

2
Steve Summit

Yes, using the value of a freed pointer for anything -- even a seemingly innocuous comparison -- is, strictly speaking, undefined behavior. It's unlikely to cause any actual problems in practice, but I'd say it's worth avoiding.

See also the C FAQ list, question 7.21.
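
One way to avoid it, sketched under stated assumptions: hand out small integer handles instead of raw pointers, and validate a handle against a table before use. This is not how hexchat works; every name here (`session_handle`, `session_open`, `session_is_live`, `session_close`, `MAX_SESSIONS`) is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_SESSIONS 8

/* Hypothetical handle: a slot index plus the generation it was issued for. */
typedef struct {
    size_t   slot;
    uint32_t generation;
} session_handle;

typedef struct {
    bool     in_use;
    uint32_t generation; /* bumped on every close, so old handles go stale */
} session_slot;

static session_slot slots[MAX_SESSIONS];

/* Opens a session in the first free slot; returns false if the table is full. */
static bool session_open(session_handle *out)
{
    for (size_t i = 0; i < MAX_SESSIONS; i++) {
        if (!slots[i].in_use) {
            slots[i].in_use = true;
            out->slot = i;
            out->generation = slots[i].generation;
            return true;
        }
    }
    return false;
}

/* True only while the session this handle was issued for is still open. */
static bool session_is_live(session_handle h)
{
    return h.slot < MAX_SESSIONS
        && slots[h.slot].in_use
        && slots[h.slot].generation == h.generation;
}

/* Closes the session if the handle is still live; stale handles are ignored. */
static void session_close(session_handle h)
{
    if (session_is_live(h)) {
        slots[h.slot].in_use = false;
        slots[h.slot].generation++;
    }
}
```

A timer callback would then carry a `session_handle` and call `session_is_live` first. The check only compares integers, so it stays well defined after the session is gone, and the generation counter keeps a reused slot from being mistaken for the old session (until the 32-bit counter wraps).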

2
supercat

tl;dr: The ability to perform certain operations such as comparisons on pointers without regard for the lifetime of objects identified thereby is a popular extension which the vast majority of compilers can be configured to support with optimizations disabled. Support for it is not mandated by the Standard, however, and aggressive optimizers may break code which relies upon it.

When the Standard was written, there were some segmented-memory platforms where attempting to load a pointer into registers would cause the system to retrieve information about the region of memory where the pointer resided. If such information was no longer available, an attempt to retrieve it could have arbitrary consequences outside the jurisdiction of the Standard. For the Standard to require that comparisons involving such pointers have no side effects beyond yielding 0 or 1 would have made the language impractical on such platforms.

While the authors of the Standard were no doubt aware that being able to use comparisons with arbitrary pointers (subject to the caveat that the results may not be particularly meaningful) was a useful feature supported by every implementation targeting conventional hardware, they saw no need to treat it as anything more than a "popular extension" which quality implementations support whenever doing so would be useful and practical.

From the C99 Rationale, p.11 line 23:

The terms unspecified behavior, undefined behavior, and implementation-defined behavior are used to categorize the result of writing programs whose properties the Standard does not, or cannot, completely describe. The goal of adopting this categorization is to allow a certain variety among implementations which permits quality of implementation to be an active force in the marketplace as well as to allow certain popular extensions, without removing the cachet of conformance to the Standard. Informative Annex J of the Standard catalogs those behaviors which fall into one of these three categories.

Unfortunately, even though nearly all platforms in use today could support such semantics at essentially zero cost (*), some compiler writers regard their desire to assume that code will never do anything with freed pointers as more important than any value that programmers could receive from what had been an essentially-universally-supported extension on conventional platforms. Unless one can guarantee that anyone using one's code will disable phony "optimizations" imposed by the authors of over-eager optimizers who seek to rid the language of useful extensions, one may have to write extra code to work around the absence of such extensions.

(*) In some scenarios where a function exposes to outside code multiple pointers to a region of storage it has allocated and freed, a compiler that had to uphold a behavioral guarantee that such pointers will compare equal would consequently be required to actually perform store operations that would leak the pointers; treating the pointers as indeterminate would allow the stores to be eliminated. Outside of contrived scenarios, however, the cost savings from eliminating such stores with pointers that are leaked to the outside world would rarely have any meaningful effect on performance.
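
The footnote's scenario can be sketched roughly as follows. `published_a`, `published_b`, and `publish_and_free` are hypothetical names, and the sketch does not show that any particular compiler actually drops the stores.

```c
#include <stdlib.h>

/* Hypothetical globals standing in for "pointers exposed to outside code". */
void *published_a;
void *published_b;

/* Publishes the address of one allocation twice, then frees it. */
void publish_and_free(void)
{
    void *p = malloc(16);
    published_a = p;  /* store 1 */
    published_b = p;  /* store 2 */
    free(p);          /* unless malloc returned NULL, both published
                         values are indeterminate from here on */
}
```

If the language guaranteed that `published_a == published_b` still evaluates to 1 after the call, both stores would have to be carried out. By the footnote's reasoning, treating the two values as indeterminate after a successful allocation has been freed is what would let an optimizer skip them.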