When is an expression formally evaluated?


As far as I know, evaluating an expression X means determining the value that the expression X yields.

But I have a question about when an expression is evaluated. Specifically, when are expressions of a class type evaluated?

For example:

struct S
{
  int x = 42;
  int& ref;
  S(): ref(x) // is the expression 'ref' evaluated in this context?
  { 
    this->x = 10;  // is the expression 'this' evaluated in this context?
  };

};

int main()
{

  S a{ };
  S b{ a }; // is the expression 'a' evaluated in this context?
           // if yes, what is the value that the expression 'a' yields?

}

I want to know the formal theory behind expression evaluation.

Answer by Brian Bi

In C++ we don't say the first expression is evaluated and then the second expression. We say "every value computation and side effect associated with the first expression is sequenced before every value computation and side effect associated with the second expression", and I think that's beautiful. --me, just now


There is not a completely satisfying answer to the general question of "when is an expression evaluated?" However, I will try to provide an answer to your question because I believe that if people see that this question doesn't have an answer, they will think that C++ is too complicated to be worth using.

The order of evaluation of expressions in a C++ program is not specified through any kind of formal description. It must be pieced together, and often inferred, from various phrasings that are not as explicit as you might hope.

For example, what happens when you call a function? The standard specifies in [expr.call] that each parameter is initialized with its corresponding argument (p7) but where does it specify that, after this has been done, the first statement in the body of the function is executed? The closest thing we have is [intro.execution]/11:

When calling a function (whether or not the function is inline), every value computation and side effect associated with any argument expression, or with the postfix expression designating the called function, is sequenced before execution of every expression or statement in the body of the called function. [...]

To be honest, this is as clear as mud. What are we to understand from "before execution of every expression or statement in the body of the called function"? Does it mean that after the parameters are initialized from the argument expressions and the postfix expression designating the called function is evaluated, every expression or statement in the body is executed? No, it doesn't; after all, control flow constructs can cause some statements to be skipped. So how do we even know that the starting point is the first statement (after which either control flow constructs or, in their absence, the lexical ordering of statements determines what happens next)? You sort of just have to infer it: if the starting point were the second statement, it would violate [stmt.pre]/1, which states that "except as indicated, statements are executed in sequence". (This is not very clear. One could easily argue that it only means that if both statements are executed, then the lexically first one gets executed first, but that's not the same as saying that if the second one gets executed, the first one must have been executed before.)
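The one guarantee that [intro.execution]/11 does state plainly can at least be observed. The following is a minimal sketch, not anything taken from the standard: call_trace, arg, add and run_call are hypothetical names introduced only for this illustration. It records when each argument expression is evaluated and when the body of the called function starts executing.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical trace log, introduced only for this illustration.
std::vector<std::string> call_trace;

// Records that an argument expression was evaluated, then yields its value.
int arg(const char* name, int value) {
    call_trace.push_back(name);
    return value;
}

int add(int lhs, int rhs) {
    // Both argument expressions have already been evaluated at this point.
    assert(call_trace.size() == 2);
    call_trace.push_back("body");
    return lhs + rhs;
}

int run_call() {
    call_trace.clear();
    // The quoted rule orders the arguments before the body of add;
    // it says nothing about the order of the two arguments themselves.
    return add(arg("first", 1), arg("second", 2));
}
```

After run_call() returns, the last entry in call_trace is always "body". The sketch deliberately makes no claim about whether "first" or "second" is recorded first, because the quoted paragraph does not order the arguments relative to each other.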

If you are looking for "the formal theory behind expression evaluation", I feel that you will be sorely disappointed.


All right, let's assume the things that we know to be obvious, and I'll address the specifics in your question.

Is a evaluated in the declaration of b? Yes, it is. The standard states that "executing a program starts a main thread of execution in which the main function is invoked" ([basic.start.main]/1), and we can assume (see above) that this means the declaration statement for a will be executed, followed by the declaration statement for b. (As above, [stmt.pre]/1 states that "except as indicated, statements are executed in sequence". Thanks to Nicol Bolas for pointing this out.)

The meaning of the declaration statement for b is given by [stmt.dcl]/2:

Variables with automatic storage duration (6.7.5.4) are initialized each time their declaration-statement is executed. Variables with automatic storage duration declared in the block are destroyed on exit from the block (8.7).

So b, having automatic storage duration, is initialized. The meaning of this initialization is given by [dcl.init.general]/17.1, which states that the object is list-initialized, and this then takes us to [dcl.init.list]/3.9:

Otherwise, if the initializer list has a single element of type E and either T is not a reference type or its referenced type is reference-related to E, the object or reference is initialized from that element (by copy-initialization for copy-list-initialization, or by direct-initialization for direct-list-initialization); if a narrowing conversion (see below) is required to convert the element to T, the program is ill-formed.
[Example 8 :

int x1 {2};    // OK
int x2 {2.0};  // error: narrowing

— end example]

This is a direct-list-initialization, so b is direct-initialized from a. For the meaning of this, we have to go back to [dcl.init.general]/17.6.2:

Otherwise, if the initialization is direct-initialization, or if it is copy-initialization where the cv-unqualified version of the source type is the same class as, or a derived class of, the class of the destination, constructors are considered. The applicable constructors are enumerated (12.4.2.4), and the best one is chosen through overload resolution (12.4). Then:

  • If overload resolution is successful, the selected constructor is called to initialize the object, with the initializer expression or expression-list as its argument(s).
  • ...

This results in the call to S's implicitly declared copy constructor, which is specified elsewhere in the standard to have the same behaviour as

S::S(const S& other) : x(other.x), ref(other.ref) {}
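One observable consequence of that memberwise copy is worth spelling out: b.ref ends up bound to a.x, not to b.x. The sketch below reuses the struct S from the question; the helper copy_aliases_source is a hypothetical name added for illustration.

```cpp
#include <cassert>

struct S {
    int x = 42;
    int& ref;
    S() : ref(x) { this->x = 10; }
    // No copy constructor is declared, so the implicit one copies memberwise:
    // x is copied from other.x, and ref is bound to whatever other.ref refers to.
};

bool copy_aliases_source() {
    S a{};
    S b{a};                  // calls the implicit copy constructor
    assert(b.x == 10);       // value of a.x at the time of the copy
    a.x = 7;                 // write to the original object only
    // b.ref still designates a.x, so it sees the new value; b.x does not.
    return &b.ref == &a.x && b.ref == 7 && b.x == 10;
}
```

This is only a way to observe the behaviour described above; it does not show how an implementation generates the constructor.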

A function call results in the initialization of the parameters from the corresponding arguments ([expr.call]/7), so other is initialized from a. [dcl.init.general]/15 specifies that the type of initialization this performs is copy-initialization. [dcl.init.ref]/5.1 governs this initialization:

If the reference is an lvalue reference and the initializer expression

  • is an lvalue (but is not a bit-field), and "cv1 T1" is reference-compatible with "cv2 T2", or
  • [...]

then the reference is bound to the initializer expression lvalue [...]

This implies evaluation of a, because if it were not evaluated, we would not know what lvalue to bind the reference to. This is another example of how the fact that something is evaluated at all generally has to be inferred, because it is not stated as explicitly as one might hope. The result of evaluating a is given by [expr.prim.id.unqual]/2:

The result is the entity denoted by the identifier. [...] The type of the expression is the type of the result. [...] The expression is an lvalue if the entity is a function, variable, structured binding (9.6), data member, or template parameter object and a prvalue otherwise (7.2.1); it is a bit-field if the identifier designates a bit-field. [...]

That is, the result of the evaluation of the expression a is "lvalue designating the object named a".
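To make "lvalue designating the object named a" a little more concrete, here is a small sketch. It assumes a hypothetical class Probe with a user-written copy constructor that records the address of its parameter (the question's S uses the implicit copy constructor instead, so this is an analogy, not the same code).

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical type for illustration: its copy constructor remembers
// which object the reference parameter 'other' was bound to.
struct Probe {
    const Probe* copied_from = nullptr;
    Probe() = default;
    Probe(const Probe& other) : copied_from(&other) {}
};

bool reference_binds_to_named_object() {
    Probe a{};
    // decltype((e)) is T& exactly when e is an lvalue of type T.
    static_assert(std::is_same<decltype((a)), Probe&>::value,
                  "the id-expression a is an lvalue");
    Probe b{a};  // 'other' is bound directly to the object named a
    assert(b.copied_from == &a);
    return b.copied_from == &a;
}
```

The address comparison shows that no temporary copy sits between a and the parameter: the reference is bound to the very object that the expression a designates.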

In S(): ref(x), ref is not an expression, so it is not evaluated. The entire construct ref(x) is known as a mem-initializer and will be evaluated if the constructor is called; this is specified by [class.base.init]/13:

In a non-delegating constructor, initialization proceeds in the following order:

  • [...]
  • Then, non-static data members are initialized in the order they were declared in the class definition (again regardless of the order of the mem-initializers).
  • Finally, the compound-statement of the constructor body is executed.

Such initialization of non-static data members is done according to [class.base.init]/7:

The expression-list or braced-init-list in a mem-initializer is used to initialize the designated subobject (or, in the case of a delegating constructor, the complete class object) according to the initialization rules of 9.4 for direct-initialization.

That is, when the constructor is called, and before the outermost block of the constructor is entered, ref is initialized according to the mem-initializer. This initialization is direct-initialization with x as the initializer.
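The ordering described here can be traced with a small sketch. The names init_trace, note, Ordered and construct_and_trace are hypothetical and exist only for this illustration. Like S, the class has one member with a default member initializer and one member initialized by a mem-initializer.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical trace log, introduced only for this illustration.
std::vector<std::string> init_trace;

// Records that an initializer expression was evaluated, then yields its value.
int note(const char* what, int value) {
    init_trace.push_back(what);
    return value;
}

struct Ordered {
    int first = note("first", 1);  // default member initializer, no mem-initializer
    int second;
    Ordered() : second(note("second", 2)) {
        // Every non-static data member has been initialized before this point.
        assert(init_trace.size() == 2);
        init_trace.push_back("body");
    }
};

std::vector<std::string> construct_and_trace() {
    init_trace.clear();
    Ordered object;
    return init_trace;
}
```

The returned trace is "first", "second", "body": the members are initialized in declaration order, and only then is the compound-statement of the constructor executed.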

Finally, in the body of S's default constructor, based on the previously discussed considerations, the statement this->x = 10; will be executed if that constructor is called. It is an expression statement. [stmt.expr]/1 says:

[...] The expression is a discarded-value expression (7.2.3). [...]

The meaning of a discarded-value expression is given by [expr.context]/2:

[...] If the (possibly converted) expression is a prvalue, the temporary materialization conversion (7.3.5) is applied. [...] The glvalue expression is evaluated and its value is discarded.

The expression this->x = 10 is a glvalue, so it will be evaluated and its value discarded. Specifically, it is an assignment expression, and [expr.ass]/1 states that

In all cases, the assignment is sequenced after the value computation of the right and left operands, and before the value computation of the assignment expression. The right operand is sequenced before the left operand.

This states that the actual assignment occurs after both the left and right operands have been evaluated (the "value computation"). This implies that this->x is evaluated. It is a class member access expression, and [expr.ref]/1 states that "the postfix expression before the dot or arrow is evaluated". That expression is this; consequently, we conclude that this is evaluated.
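The value categories used in this last step can be checked by the compiler. The sketch below uses a hypothetical struct Widget that mirrors the constructor body from the question, and relies on the fact that decltype((e)) yields T& when e is an lvalue and plain T when e is a prvalue, without evaluating e.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical type for illustration, mirroring the constructor body in the question.
struct Widget {
    int x = 42;
    Widget() {
        // The built-in assignment expression is an lvalue of type int (hence a glvalue).
        static_assert(std::is_same<decltype((this->x = 10)), int&>::value,
                      "assignment expression is an lvalue");
        // 'this' is a prvalue of pointer type, so no reference is added.
        static_assert(std::is_same<decltype((this)), Widget*>::value,
                      "this is a prvalue");
        assert(this->x == 42);  // the default member initializer ran before the body
        this->x = 10;           // discarded-value expression: evaluated, value dropped
    }
};

int value_after_construction() {
    Widget w;
    return w.x;
}
```

The static_asserts only confirm the value categories; that the operands are actually evaluated at run time is visible from the effect, since value_after_construction() returns 10.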