Am I interpreting C order of operations correctly here?


I was puzzled by the fact that cppreference says that postincrement's value computation is sequenced before its side effect, but gives no such guarantee for preincrement.

I have now come up with an example where this matters but I am unsure if my analysis is correct.

As I understand it, these two programs differ in that the first contains UB, and the second does not:

#include <stddef.h>
#include <stdio.h>

int main(void) {
    int arr[] = {0, 1, 2};
    int i = 1;
    int x = ++arr[arr[i]];
}

#include <stddef.h>
#include <stdio.h>

int main(void) {
    int arr[] = {0, 1, 2};
    int i = 1;
    int x = arr[arr[i]]++;
}

My analysis of the expression ++arr[arr[i]] is as follows:

  1. There are these sequenced-before relations:
    • Value computation of i is sequenced before value computation of arr[i]
    • Value computation of arr[i] is sequenced before value computation of arr[arr[i]]
    • Value computation of arr[arr[i]] is sequenced before value computation of ++arr[arr[i]]
  2. The side-effect of ++arr[arr[i]] is unsequenced with respect to these.
  3. The compiler may choose any order satisfying these relations and may delete the value computation of arr[arr[i]] since it is not used.
  4. In any possible ordering, the value computation of arr[arr[i]] refers to the same scalar object as arr[i].
  5. The side effect of ++arr[arr[i]] modifies the scalar object arr[arr[i]], but it is accessed unsequenced to that by arr[i].

However, if we use postincrement instead, we introduce a new sequenced-before relation: The value computation of arr[arr[i]]++ is sequenced before its side-effect. Therefore, by transitivity, the side-effect is no longer unsequenced to arr[i].

However, I am unsure if this is accurate. In particular, I am not sure how exactly evaluation of post-/preincrement is defined. Does it perform a value computation of its operand? If it does, does this mean that ++*ptr is UB while (*ptr)++ is not? If it does not, how is the value computation of the full expression performed—can any operator access the value of an lvalue expression without performing value computation on that expression?


There are 2 answers

schuelermine (best answer)

The analysis is incorrect. In particular, the side effect of ++arr[arr[i]] is not unsequenced relative to the other value computations. The C standard (note: I have only read a draft of it, N3096) specifies that ++E is equivalent to E += 1:

[6.5.3.1]
The expression ++E is equivalent to (E+=1), where the value 1 is of the appropriate type.

Further, it specifies that for assignment operators, including compound assignment operators, the value computations of the operands are sequenced before the side effect:

[6.5.16]
The side effect of updating the stored value of the left operand is sequenced after the value computations of the left and right operands.

The read of arr[i] is part of the value computation of the left operand arr[arr[i]], so it is sequenced before the update, and neither program has UB.
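As a rough illustration of the equivalence, and of the ++*ptr versus (*ptr)++ question, the sketch below applies the three spellings to the same kind of operand. The function names are made up for this example and do not come from the standard.

```c
#include <assert.h>

/* Hypothetical helpers, named only for this sketch. */

/* ++*p: per the quoted wording, equivalent to (*p += 1). */
int prefix_inc(int *p) { return ++*p; }

/* The compound-assignment spelling of the same operation. */
int compound_inc(int *p) { return (*p += 1); }

/* (*p)++: yields the old value; the object is updated as a side effect. */
int postfix_inc(int *p) { return (*p)++; }

/* Self-check: all three leave the object incremented by one. */
void check_increment_forms(void) {
    int a = 1, b = 1, c = 1;
    assert(prefix_inc(&a) == 2 && a == 2);
    assert(compound_inc(&b) == 2 && b == 2);
    assert(postfix_inc(&c) == 1 && c == 2);
}
```

Calling check_increment_forms() from main should pass without tripping an assertion; the prefix and compound forms return the new value, the postfix form the old one.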

Vlad from Moscow

The expressions

arr[arr[i]]++;

and

++arr[arr[i]];

are both correct. Neither has undefined behavior.

The subscript operator is defined like

postfix-expression [ expression ]

So the above expressions may be rewritten like

arr[some-expression]++

and

++arr[some-expression]

where some-expression is arr[i].

The subscript expression arr[i] is defined as *( arr + i ), where the array designator is implicitly converted to a pointer of type int * to its first element. Evaluating it yields an lvalue designating the i-th element of the array, whose value is then used as the index in the outer subscript expression

arr[some-expression]

and then is evaluated like

*( arr + value-of-some-expression )

which again yields an lvalue designating the array element at index value-of-some-expression.
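The two-step evaluation described above can be sketched with explicit pointer arithmetic. The helper name nested_element is hypothetical and exists only for this illustration.

```c
#include <assert.h>

/* Hypothetical helper: returns the address designated by arr[arr[i]],
   computed through the *(arr + ...) form instead of the [] form. */
int *nested_element(int *arr, int i) {
    int inner = *(arr + i);   /* value of arr[i] */
    return arr + inner;       /* address of arr[arr[i]] */
}

/* Self-check against the subscript spelling, using the array from the question. */
void check_subscript_equivalence(void) {
    int arr[] = {0, 1, 2};
    assert(nested_element(arr, 1) == &arr[arr[1]]);
    assert(nested_element(arr, 2) == &arr[2]);
    assert(*nested_element(arr, 1) == 1);
}
```

With the array from the question, arr[1] is 1, so arr[arr[1]] and arr[1] designate the same object.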

The result of the postfix increment operator is the value of its operand (not an lvalue). As a side effect, the value of the operand is incremented.

The value of the prefix increment operator is the value of its operand after the increment.

Note that postfix operators have higher precedence than unary operators and group from left to right, while unary operators group from right to left.

In your code, with the postfix increment operator, the variable x gets the value of the second element of the array, which is 1, and as a side effect the second element is incremented and becomes 2.

If the prefix increment operator is used, the variable x gets the value 2.
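A minimal sketch of both programs from the question, wrapped in functions so the resulting values can be checked. The function names are assumptions made for this example.

```c
#include <assert.h>

/* Postfix form from the question; returns the value stored in x. */
int run_postfix(int arr[3]) {
    int i = 1;
    int x = arr[arr[i]]++;
    return x;
}

/* Prefix form from the question; returns the value stored in x. */
int run_prefix(int arr[3]) {
    int i = 1;
    int x = ++arr[arr[i]];
    return x;
}

/* Self-check: x differs between the two forms, the array ends up the same. */
void check_question_programs(void) {
    int a[] = {0, 1, 2};
    int b[] = {0, 1, 2};
    assert(run_postfix(a) == 1 && a[1] == 2);
    assert(run_prefix(b) == 2 && b[1] == 2);
}
```

In both cases only arr[1] changes; the other two elements keep their initial values.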

As for your statements

Value computation of i is sequenced before value computation of arr[i]

then (the C17 Standard, 6.5 Expressions)

1 An expression is a sequence of operators and operands that specifies computation of a value, or that designates an object or a function, or that generates side effects, or that performs a combination thereof. The value computations of the operands of an operator are sequenced before the value computation of the result of the operator.

And (6.5.2.4 Postfix increment and decrement operators)

2 The result of the postfix++ operator is the value of the operand. As a side effect, the value of the operand object is incremented (that is, the value 1 of the appropriate type is added to it). See the discussions of additive operators and compound assignment for information on constraints, types, and conversions and the effects of operations on pointers. The value computation of the result is sequenced before the side effect of updating the stored value of the operand. With respect to an indeterminately-sequenced function call, the operation of postfix++ is a single evaluation. Postfix ++ on an object with atomic type is a read-modify-write operation with memory_order_seq_cst memory order semantics.

And (6.5.3.1 Prefix increment and decrement operators)

2 The value of the operand of the prefix ++ operator is incremented. The result is the new value of the operand after incrementation. The expression ++E is equivalent to (E+=1). See the discussions of additive operators and compound assignment for information on constraints, types, side effects, and conversions and the effects of operations on pointers.

As for undefined behavior, the standard says (6.5 Expressions)

2 If a side effect on a scalar object is unsequenced relative to either a different side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined. If there are multiple allowable orderings of the subexpressions of an expression, the behavior is undefined if such an unsequenced side effect occurs in any of the orderings.

and in the provided programs there is no such situation. A program would have undefined behavior if you wrote, for example,

int x = arr[arr[i]]++ + arr[arr[i]];

In this case the side effect of the expression arr[arr[i]]++ is unsequenced relative to the value computation of the expression arr[arr[i]].

Here are some simple examples of undefined behavior

x = y + y++;
x = y + ++y;
x = y++ + ++y;
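Because these expressions have no defined meaning, any well-defined rewrite has to pick an intended order explicitly. The sketch below shows one possible reading for each of the first two examples, with the side effect moved into its own statement. The function names and the chosen readings are assumptions made for illustration.

```c
#include <assert.h>

/* Hypothetical rewrites; each picks ONE possible meaning for an
   expression that is undefined in its original form. */

/* A reading of "x = y + y++": use the old value twice, then increment. */
int sum_then_increment(int *y) {
    int old = *y;
    *y += 1;            /* the side effect is its own full expression */
    return old + old;
}

/* A reading of "x = y + ++y": increment first, then use the new value twice. */
int increment_then_sum(int *y) {
    *y += 1;
    return *y + *y;     /* only reads of *y here, no side effect */
}

/* Self-check of both rewrites, applied one after the other. */
void check_rewrites(void) {
    int y = 3;
    assert(sum_then_increment(&y) == 6 && y == 4);
    assert(increment_then_sum(&y) == 10 && y == 5);
}
```

Putting the modification in a separate statement sequences it relative to every read of y, which is what the one-line versions lack.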