The terms 'operator precedence' and 'order of evaluation' are commonly used in programming, and it is important for a programmer to understand both. As far as I understand them, the two concepts are tightly bound; one cannot do without the other when talking about expressions.
Let us take a simple example:
int a=1; // Line 1
a = a++ + ++a; // Line 2
printf("%d",a); // Line 3
Now, it is evident that Line 2 leads to Undefined Behavior, since a is modified more than once without an intervening sequence point. Sequence points in C and C++ include:
1. Between evaluation of the left and right operands of the && (logical AND), || (logical OR), and comma operators. For example, in the expression *p++ != 0 && *q++ != 0, all side effects of the sub-expression *p++ != 0 are completed before any attempt to access q.
2. Between the evaluation of the first operand of the ternary "question-mark" operator and the second or third operand. For example, in the expression a = (*p++) ? (*p++) : 0 there is a sequence point after the first *p++, meaning it has already been incremented by the time the second instance is executed.
3. At the end of a full expression. This category includes expression statements (such as the assignment a=b;), return statements, the controlling expressions of if, switch, while, or do-while statements, and all three expressions in a for statement.
4. Before a function is entered in a function call. The order in which the arguments are evaluated is not specified, but this sequence point means that all of their side effects are complete before the function is entered. In the expression f(i++) + g(j++) + h(k++), f is called with a parameter of the original value of i, but i is incremented before entering the body of f. Similarly, j and k are updated before entering g and h respectively. However, it is not specified in which order f(), g(), h() are executed, nor in which order i, j, k are incremented. The values of j and k in the body of f are therefore undefined. Note that a function call f(a,b,c) is not a use of the comma operator and the order of evaluation for a, b, and c is unspecified.
5. At a function return, after the return value is copied into the calling context. (This sequence point is only specified in the C++ standard; it is present only implicitly in C.)
6. At the end of an initializer; for example, after the evaluation of 5 in the declaration int a = 5;.
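To make the first two kinds of sequence point concrete, here is a minimal sketch built around the two expressions quoted in the list. The function names count_common_prefix and second_if_first_nonzero are hypothetical, chosen only for illustration, and the arrays passed to the first function are assumed to be zero-terminated.

```c
/* Hypothetical helper: walks two zero-terminated int arrays in lockstep
 * and counts the leading positions where both elements are non-zero. */
int count_common_prefix(const int *p, const int *q)
{
    int n = 0;
    /* The && operator introduces a sequence point: the increment of p is
     * complete before q is touched, and if *p was 0 then q is neither
     * read nor incremented. */
    while (*p++ != 0 && *q++ != 0)
        n++;
    return n;
}

/* Hypothetical helper: returns the second element if the first one is
 * non-zero, otherwise 0. */
int second_if_first_nonzero(const int *p)
{
    /* The sequence point after the first operand of ?: means p already
     * points at the second element when the middle operand reads it. */
    return (*p++) ? (*p++) : 0;
}
```

Both functions modify the same pointer twice in one expression, yet the behavior is well defined because the two modifications are separated by a sequence point.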
Thus, going by Point # 3:
At the end of a full expression. This category includes expression statements (such as the assignment a=b;), return statements, the controlling expressions of if, switch, while, or do-while statements, and all three expressions in a for statement.
Line 2 clearly leads to Undefined Behavior. This shows how Undefined Behavior is tightly coupled with Sequence Points.
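For comparison, here is a sketch of how the same computation could be written without Undefined Behavior, by giving each modification of a its own full expression. The function name sum_post_then_pre is hypothetical, and the left-to-right order it uses is my own assumption; the original Line 2 does not guarantee any particular order.

```c
/* Hypothetical well-defined rewrite of "a = a++ + ++a", assuming the
 * left operand is meant to be evaluated first. */
int sum_post_then_pre(int a)
{
    int left = a++;   /* left gets the old value; a is updated by the end of this statement */
    int right = ++a;  /* a is incremented again; right gets the new value */
    return left + right;
}
```

Called with 1, this returns 1 + 3, i.e. 4. Each increment is followed by the end of a full expression, so a is never modified twice between two sequence points.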
Now let us take another example:
int x=10,y=1,z=2; // Line 4
int result = x<y<z; // Line 5
Now it's evident that Line 5 will make the variable result store 1.
Now the expression x<y<z in Line 5 can be evaluated as either x<(y<z) or (x<y)<z. In the first case the value of result will be 0 and in the second case result will be 1. But we know that when operators have the same precedence, associativity comes into play; since < associates left to right, the expression is evaluated as (x<y)<z.
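The two groupings can be checked directly. This is a small sketch with hypothetical function names (group_left, group_right, ungrouped); some compilers may warn about the unparenthesized form, but it is valid C.

```c
/* Explicit left grouping: compare x and y first, then compare that 0/1 result with z. */
int group_left(int x, int y, int z)
{
    return (x < y) < z;
}

/* Explicit right grouping: compare y and z first, then compare x with that 0/1 result. */
int group_right(int x, int y, int z)
{
    return x < (y < z);
}

/* No parentheses: the language groups this as (x < y) < z because < is left-associative. */
int ungrouped(int x, int y, int z)
{
    return x < y < z;
}
```

With x=10, y=1, z=2, group_left gives (10<1)<2, which is 0<2, i.e. 1, while group_right gives 10<(1<2), which is 10<1, i.e. 0. The ungrouped form matches group_left.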
This is what is said in this MSDN Article:
The precedence and associativity of C operators affect the grouping and evaluation of operands in expressions. An operator's precedence is meaningful only if other operators with higher or lower precedence are present. Expressions with higher-precedence operators are evaluated first. Precedence can also be described by the word "binding." Operators with a higher precedence are said to have tighter binding.
Now, about the above article:
It mentions "Expressions with higher-precedence operators are evaluated first."
It may sound incorrect. But I think the article is not saying something wrong if we consider that () is also an operator: x<y<z is the same as (x<y)<z. My reasoning is that if associativity did not come into play, the complete expression's evaluation would become ambiguous, since < is not a Sequence Point.
Also, another link I found says this on Operator Precedence and Associativity:
This page lists C operators in order of precedence (highest to lowest). Their associativity indicates in what order operators of equal precedence in an expression are applied.
So, taking the second example of int result=x<y<z, we can see that there are in all 3 expressions, x, y and z, since the simplest form of an expression consists of a single literal constant or object. Hence the results of the expressions x, y, z would be their rvalues, i.e., 10, 1 and 2 respectively. Hence, we may now interpret x<y<z as 10<1<2.
Now, doesn't Associativity come into play, since we now have 2 expressions to be evaluated, either 10<1 or 1<2, and since the precedence of the operators is the same, they are evaluated from left to right?
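One way to probe this question is to replace the three operands with function calls that announce themselves. This is only a sketch: trace and compare_traced are hypothetical names I made up. As far as I can tell from the standard text quoted further down (J.1), the grouping is fixed as (x<y)<z, but the order in which the three calls happen is unspecified, so I deliberately show no expected output for the printed lines.

```c
#include <stdio.h>

/* Hypothetical helper: reports that an operand is being evaluated, then yields its value. */
static int trace(const char *name, int value)
{
    printf("evaluating %s\n", name);
    return value;
}

/* Grouped as (x < y) < z by associativity. The order in which the three
 * trace() calls run is not fixed by the language, so the printed order
 * may differ between compilers; only the final result is guaranteed. */
int compare_traced(void)
{
    return trace("x", 10) < trace("y", 1) < trace("z", 2);
}
```

Whatever order the messages appear in, the function returns 1, because the grouping of the operators does not depend on when each operand was computed.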
Taking this last example as my argument:
int myval = ( printf("Operator\n"), printf("Precedence\n"), printf("vs\n"),
printf("Order of Evaluation\n") );
Now in the above example, since the comma operators all have the same precedence, the expressions are evaluated left-to-right and the return value of the last printf() is stored in myval.
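To contrast the comma operator with the commas that separate function arguments, here is a minimal sketch. The names counter, next_id, pick_first, comma_operator_result and argument_list_result are all hypothetical and exist only for this illustration.

```c
static int counter;

/* Hypothetical helper: returns 1, 2, 3, ... on successive calls. */
static int next_id(void)
{
    return ++counter;
}

/* Hypothetical helper: returns its first argument and ignores the second. */
static int pick_first(int first, int second)
{
    (void)second;
    return first;
}

/* Comma operator: operands are evaluated left to right with a sequence
 * point after each one, and the value is that of the last operand. */
int comma_operator_result(void)
{
    counter = 0;
    return (next_id(), next_id(), next_id());
}

/* Function-call commas are separators, not operators: the order in which
 * the two arguments are evaluated is unspecified. */
int argument_list_result(void)
{
    counter = 0;
    return pick_first(next_id(), next_id());
}
```

comma_operator_result always returns 3, while argument_list_result may return 1 or 2 depending on which argument the compiler evaluates first.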
In ISO/IEC 9899:201x under J.1 Unspecified behavior it mentions:
The order in which subexpressions are evaluated and the order in which side effects take place, except as specified for the function-call (), &&, ||, ?:, and comma operators (6.5).
Now I would like to know, would it be wrong to say:
Order of Evaluation depends on the precedence of operators, leaving cases of Unspecified Behavior.
I would like to be corrected if any mistakes were made in something I said in my question.
The reason I posted this question is because of the confusion created in my mind by the MSDN Article. Is it in Error or not?