being sure about "unknown evaluation order"

1.7k views Asked by At

Since version 1.80, Cppcheck tells me that

Expression 'msg[ipos++]=checksum(&msg[1],ipos-1)' depends on order of evaluation of side effects

in this code sequence (simplified, data is a variable)

BYTE msg[MAX_MSG_SIZE];  // msg can be smaller, depending on data encoded
int ipos = 0;
msg[ipos++] = MSG_START;
ipos += encode(&msg[ipos], data);
msg[ipos++] = checksum(&msg[1], ipos-1);  // <---- Undefined Behaviour?
msg[ipos++] = MSG_END;   // increment ipos to the actual size of msg

and treats this as an error, not a portability issue.

It's C code (incorporated into a C++-dominated project), compiled with a C++98-complient compiler, and meanwhile runs as expected for decades. Cppcheck is run with C++03, C89, auto-detect language.

I confess that the code should better be rewritten. But before doing this, I try to figure out: Is it really dependent on evaluation order? As I understand it, the right operand is being evaluated first (it needs to before the call), then the assignment is taking place (to msg[ipos]) with the increment of ipos done last.

Am I wrong with this assumption, or is it just a false positive?

2

There are 2 answers

7
John Zwinck On BEST ANSWER

This code does indeed depend on evaluation order in a way which is not well defined:

msg[ipos++] = checksum(&msg[1], ipos-1);

Specifically, it is not specified whether ipos++ will increment before or after ipos-1 is evaluated. This is because there is no "sequence point" at the =, only at the end of the full expression (the ;).

The function call is a sequence point. But that only guarantees that ipos-1 happens before the function call. It does not guarantee that ipos++ happens after.

It appears the code should be rewritten this way:

msg[ipos] = checksum(&msg[1], ipos-1);
ipos++; // or ++ipos
8
Lundin On

The order of evaluation of the operands of = is unspecified. So to begin with, the code relies on unspecified behavior.

What makes the case even worse is that ipos is used twice in the same expression with no sequence point in between, for unrelated purposes - which leads to undefined behavior.

C99 6.5

Between the previous and next sequence point an object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be read only to determine the value to be stored.

The very same text applies to C90, C99, C++98 and C++03. In C11 and C++11 the wording has changed but the meaning is the same. It is undefined behavior, pre-C++11.

The compiler is not required to give a diagnostic for undefined behavior. You were lucky that it did. It is not a false positive - your code contains a severe bug, all the way from the original C code.