Who defines operator precedence and associativity, and how does it relate to order of evaluation?

Introduction

In every textbook on C/C++, you'll find an operator precedence and associativity table such as the following:

Operator Precedence And Associativity Table

http://en.cppreference.com/w/cpp/language/operator_precedence

One of the questions on StackOverflow asked something like this:

In what order do the following functions execute:

f1() * f2() + f3();
f1() + f2() * f3();

Referring to the previous chart, I confidently replied that functions have left-to-right associativity, so in the previous statements they are evaluated like this in both cases:

f1() -> f2() -> f3()

After the functions are evaluated you finish the evaluation like this:

(a1 * a2) + a3
a1 + (a2 * a3)

To my surprise, many people told me I was flat out wrong. Determined to prove them wrong, I decided to turn to the ANSI C11 standard. I was once again surprised to find out that very little is mentioned on operator precedence and associativity.

Questions

  1. If my belief that functions are always evaluated from left-to-right is wrong, what does the table referring to function precedence and associativity really mean?

  2. Who defines operator precedence and associativity if it's not ANSI? If it is ANSI that defines them, why is so little said about operator precedence and associativity? Are operator precedence and associativity inferred from the ANSI C standard, or are they defined in mathematics?

There are 6 answers

Joseph Mansfield On BEST ANSWER

Operator precedence is defined in the appropriate standard. The standards for C and C++ are the One True Definition of what exactly C and C++ are. So if you look closely, the details are there. In fact, the details are in the grammar of the language. For example, take a look at the grammar production rule for + and - in C++ (collectively, additive-expressions):

additive-expression:
  multiplicative-expression
  additive-expression + multiplicative-expression
  additive-expression - multiplicative-expression

As you can see, a multiplicative-expression is a subrule of an additive-expression. This means that if you have something like x + y * z, the y * z expression is a subexpression of x + y * z. This defines the precedence between these two operators.

We can also see that the left operand of an additive-expression expands to another additive-expression, which means that with x + y + z, x + y is a subexpression of it. This defines the associativity.

Associativity determines how adjacent uses of the same operator will be grouped. For example, + is left-to-right associative, which means that x + y + z will be grouped like so: (x + y) + z.

Don't mistake this for order of evaluation. There is absolutely no reason why the value of z could not be computed before x + y is. What matters is that it is x + y that is computed and not y + z.

For the function call operator, left-to-right associativity means that f()() (which could happen if f returned a function pointer, for example) is grouped like so: (f())() (of course, the other direction wouldn't make any sense).
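The f()() case can be sketched as follows. This is only an illustration: the helper `answer`, the alias `IntFn`, and the two wrapper functions are hypothetical names that do not appear in the question.

```cpp
#include <cassert>

// Hypothetical helper: the function that f() hands back.
int answer() { return 42; }

// Alias for "pointer to a function taking no arguments and returning int".
using IntFn = int (*)();

// Hypothetical f: returns a function pointer, so its result can itself be called.
IntFn f() { return &answer; }

// f()() is grouped as (f())(): f is called first, then its result is called.
int call_twice() { return f()(); }

// The same thing with the intermediate value given a name.
int call_twice_explicit() {
    IntFn temp = f();
    assert(temp != nullptr); // f always returns a valid pointer in this sketch
    return temp();
}
```

Both wrappers return the same value, because the grouping (f())() is the only one the grammar allows.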

Now let's consider the example you were looking at:

f1() + f2() * f3()

The * operator has higher precedence than the + operator, so the expressions are grouped like so:

f1() + (f2() * f3())

We don't even have to consider associativity here, because we don't have any of the same operator adjacent to each other.

The order in which the function call expressions are evaluated is, however, unspecified. There's no reason f3 couldn't be called first, then f1, and then f2. (Strictly speaking, the calls are indeterminately sequenced rather than unsequenced: they cannot overlap, but they may run in any order.) The only requirement in this case is that the operands of an operator are evaluated before the operator itself is. So f2 and f3 have to be called before the * is evaluated, and the * must be evaluated and f1 must be called before the + is evaluated.
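One way to observe this is to have each function record that it ran. The sketch below assumes hypothetical bodies for f1, f2 and f3 (the question never shows them) and a hypothetical global `call_log`. It checks the value and the number of calls, but deliberately not the order.

```cpp
#include <cassert>
#include <string>

// Records the order in which f1, f2 and f3 are actually called.
std::string call_log;

// Hypothetical bodies: each function logs its own digit and returns a value.
int f1() { call_log += '1'; return 2; }
int f2() { call_log += '2'; return 3; }
int f3() { call_log += '3'; return 4; }

// Grouped as f1() + (f2() * f3()) by precedence. The value is always
// 2 + (3 * 4), but the contents of call_log depend on the compiler.
int evaluate() {
    call_log.clear();
    int result = f1() + f2() * f3();
    assert(call_log.size() == 3); // each function ran exactly once, in some order
    return result;
}
```

Printing `call_log` after calling `evaluate()` shows the order a particular compiler happened to choose. Another compiler, or another optimization level, may legitimately print a different one.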

Some operators do, however, impose a sequencing on the evaluation of their operands. For example, in x || y, x is always evaluated before y. This allows for short-circuiting, where y does not need to be evaluated if x is known already to be true.
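A small sketch of that short-circuiting, using hypothetical operand functions (`x_true`, `x_false`, `y`) and a hypothetical counter that records how often the right-hand operand is evaluated:

```cpp
#include <cassert>

int y_calls = 0; // counts evaluations of the right-hand operand

bool x_true()  { return true; }
bool x_false() { return false; }
bool y()       { ++y_calls; return true; }

// x is always evaluated first; y is evaluated only when x is false.
bool either(bool (*x)()) { return x() || y(); }

// Hypothetical check: call it from main to exercise both paths.
void demo() {
    y_calls = 0;
    bool a = either(x_true);   // short-circuits: y is skipped
    assert(a && y_calls == 0);
    bool b = either(x_false);  // x is false, so y must be evaluated
    assert(b && y_calls == 1);
}
```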

The order of evaluation was previously defined in C and C++ in terms of sequence points; since C11 and C++11, both standards instead define it in terms of a sequenced-before relationship. For more information, see Undefined Behaviour and Sequence Points.

ouah On

The precedence of operators in the C Standard is indicated by the syntax.

(C99, 6.5p3) "The grouping of operators and operands is indicated by the syntax. 74)"

74) "The syntax specifies the precedence of operators in the evaluation of an expression"

C99 Rationale also says

"The rules of precedence are encoded into the syntactic rules for each operator."

and

"The rules of associativity are similarly encoded into the syntactic rules."

Also note that associativity has nothing to do with evaluation order. In:

f1() * f2() + f3()

the function calls may be evaluated in any order. The C syntactic rules say that f1() * f2() + f3() means (f1() * f2()) + f3(), but the evaluation order of the operands in the expression is unspecified.

AudioBubble On

Left-to-right associativity means that f() - g() - h() means (f() - g()) - h(), nothing more. Suppose f returns 1, g returns 2, and h returns 3. Left-to-right associativity means the result is (1 - 2) - 3, or -4. A compiler is still permitted to call g and h first, since that has nothing to do with associativity, but it is not allowed to give a result of 1 - (2 - 3), which would be something completely different.
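The arithmetic can be checked at compile time. This is a minimal sketch that assumes f, g and h are constexpr functions returning 1, 2 and 3. The variable names are made up for the illustration.

```cpp
// Hypothetical constexpr versions of f, g and h returning 1, 2 and 3.
constexpr int f() { return 1; }
constexpr int g() { return 2; }
constexpr int h() { return 3; }

// Left-to-right associativity: parsed as (f() - g()) - h().
constexpr int grouped_by_language = f() - g() - h();

// The grouping that associativity rules out.
constexpr int other_grouping = f() - (g() - h());

static_assert(grouped_by_language == (1 - 2) - 3, "matches the explicit left grouping");
static_assert(grouped_by_language == -4, "(1 - 2) - 3 is -4");
static_assert(other_grouping == 2, "1 - (2 - 3) is 2, a different value");
```

None of these checks says anything about which of f, g and h runs first; they only pin down how the operands are grouped.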

Idan Arye On

Precedence and associativity are defined in the standard, and they decide how to build the syntax tree. Precedence works by operator type (1+2*3 is 1+(2*3) and not (1+2)*3), and associativity works by operator position (1+2+3 is (1+2)+3 and not 1+(2+3)).

Order of evaluation is different: it does not define how to build the syntax tree, it defines the order in which the nodes of the syntax tree are evaluated. For most operators the order of evaluation is deliberately left unspecified, so you cannot rely on it: compilers are free to choose any order they see fit. This gives compilers room to optimize the code. The idea is that programmers write code that is not affected by the order of evaluation and yields the same results no matter the order.
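As a rough illustration of writing code that does not depend on evaluation order, the sketch below uses a hypothetical function `next` with a side effect. The names `fragile` and `robust` are likewise made up for the example.

```cpp
#include <cassert>

int counter = 0;

// Hypothetical function with a side effect: returns 1, 2, 3, ... on successive calls.
int next() { return ++counter; }

// Order-dependent: the two calls may run in either order, so this may
// yield 1 - 2 or 2 - 1 depending on the compiler.
int fragile() {
    counter = 0;
    return next() - next();
}

// Order-independent: separate statements fix the order of the calls.
int robust() {
    counter = 0;
    int first = next();   // always the first call
    int second = next();  // always the second call
    assert(first == 1 && second == 2);
    return first - second;
}
```

Splitting the calls into separate statements is the same trick as the temporaries in the other answers: each full statement is completed before the next one starts.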

Barmar On

One way to think about precedence and associativity is to imagine that the language only allows statements containing an assignment and one operator, rather than multiple operators. So a statement like:

a = f1() * f2() + f3();

would not be allowed, since it has 5 operators: 3 function calls, multiplication, and addition. In this simplified language, you would have to assign everything to temporaries and then combine them:

temp1 = f1();
temp2 = f2();
temp3 = temp1 * temp2;
temp4 = f3();
a = temp3 + temp4;

Precedence specifies that the product temp1 * temp2 is an operand of the addition, so the multiplication statement must be performed before the addition statement. But it doesn't specify the relative order of the three function-call statements; it would be just as valid to do:

temp4 = f3();
temp2 = f2();
temp1 = f1();
temp3 = temp1 * temp2;
a = temp3 + temp4;

sftrabbit gave an example where associativity of function call operators is relevant:

a = f()();

When simplifying it as above, this becomes:

temp = f();
a = temp();

Jan Schultke On

Firstly, you're conflating operator precedence/associativity and order of evaluation. These are two different concepts.

  • Operator precedence tells you which operations take place between what operands.
  • Order of evaluation tells you in what order these operands are evaluated, prior to the operator.

Overall, for f1() + f2() * f3(), operator precedence tells us that the following operations take place:

x1 = f1()
x2 = f2()
x3 = f3()
xm = x2 * x3
xa = x1 + xm

The only ordering requirement (due to dependencies) is that x2 and x3 must be computed before xm, and x1 and xm must be computed before xa. Other than that, the compiler is free to do things in any order it wants.

Operator precedence is governed by the language grammar

Operator precedence is a consequence of the grammar of the language. For example, both C and C++ languages have the rule:

additive-expression:
        multiplicative-expression
        additive-expression + multiplicative-expression
        additive-expression - multiplicative-expression

According to this rule, the expression f1() + f2() * f3() is parsed as:

                  additive-expression
                          │
           ┌──────────────┼──────────────┐
           │              │              │
   additive-expression   '+'   multiplicative-expression
           │                   ┌─────────┼─────────┐
 multiplicative-expression     │         │         │
           │                  ...       '*'       ...
          ...                  │                   │
           │            postfix-expression  postfix-expression
    postfix-expression         │                   │
           │                  f2()                f3()
          f1()

Note: The ... indicates that there are many rules which are expanded until arriving at postfix-expression.

In other words, the operands are grouped like:

( f1() + ( f2() * f3() ) )

Precedence, arity, and associativity are all consequences of grammatical rules.

Order of evaluation is governed by sequencing rules

As already stated, order of evaluation is separate from precedence. Precedence can only tell you that a multiplication f2() * f3() takes place, but it does not tell you whether f2() or f3() is executed first.

[intro.execution] p10 states:

Except where noted, evaluations of operands of individual operators and of subexpressions of individual expressions are unsequenced.

Note that [intro.execution] p11 also states that function calls cannot be interleaved, so functions behave as if they were indeterminately sequenced, not unsequenced. In other words, f2() and f3() cannot be executed in parallel/interleaved, but it is unspecified whether f2() or f3() is executed first.

Some operators do have sequencing; for example, since C++17, in f4() << f5(), f4() is always executed before f5(), because of the sequencing in [expr.shift] p4:

[For E1 << E2,] the expression E1 is sequenced before the expression E2.
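A hedged sketch of the shift case, with hypothetical bodies for f4 and f5 and a hypothetical `order` string that records which operand ran first:

```cpp
#include <cassert>
#include <string>

std::string order; // records which operand was evaluated first

// Hypothetical bodies: each function logs its own digit and returns a value.
int f4() { order += '4'; return 1; }
int f5() { order += '5'; return 3; }

// Built-in << on ints, i.e. 1 << 3. In C++17 and later f4() is sequenced
// before f5(), so order ends up as "45"; earlier standards left the order
// of the two calls unspecified.
int shifted() {
    order.clear();
    int result = f4() << f5();
    assert(order.size() == 2); // both operands were evaluated exactly once
    return result;
}
```

The value of the expression is the same under every standard. Only the guaranteed contents of `order` depend on compiling as C++17 or later.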