Toughest bug/war stories


Yup. Brewbuck got it. It's a particularly pernicious case of undefined behavior because not many of us think to check the order of evaluation on assignment operations, taking it for granted (and falling flat on our faces) that the right hand side is evaluated first.

Edit: It's the postfix operator that doesn't offer you that guarantee, abachler. Not the assignment operator.

Here:

ISO/IEC 14882:2003(E), 5 Expressions, paragraph 4:

Except where noted, the order of evaluation of operands of individual operators and subexpressions of individual expressions, and the order in which side effects take place, is unspecified. Between the previous and next sequence point a scalar object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be accessed only to determine the value to be stored. The requirements of this paragraph shall be met for each allowable ordering of the subexpressions of a full expression; otherwise the behavior is undefined.

The example given there:

Code:

i = v[i++]; // the behavior is unspecified
i = 7, i++, i++; // i becomes 9
i = ++i + 1; // the behavior is unspecified
i = i + 1; // the value of i is incremented

No, that part is not undefined. The value of the index must be calculated before the assignment, as the assignment depends on the data at that memory location, hence the pointer must be resolved prior to assignment....

Blah blah blah. What you say makes sense, but not that much. Why does the value of the index have to be decided first? Nothing demands that the left side be evaluated before the right.

That is a single expression. We've seen a few of these here, and they always end with someone demonstrating that their compiler did it in an unexpected way. Even if it seems silly, it's still undefined.

Blah blah blah. What you say makes sense, but not that much. Why does the value of the index have to be decided first? Nothing demands that the left side be evaluated before the right.

It's a dependency issue. The value of foo[x] cannot be evaluated prior to knowing the value of x. If x is an expression, it must therefore be evaluated BEFORE you can find the value of foo[x] (duh!!!!), as you cannot index into an array without knowing the value of the index. Since assignment cannot take place until you have the value to assign, evaluation of x must occur before assignment. Because the standard specifically states that i++ evaluates to the value of i prior to incrementing, foo[i++] therefore evaluates to foo[i]. foo[i++] + i is undefined because the standard does not require foo[i++] to be evaluated in any particular priority over i, so long as proper order of operations is maintained.

foo[i] = foo[i++] will always evaluate to foo[i] = foo[i]. The left side is always calculated first in any standard compliant compiler specifically because expressions on the right side may alter the index value in this way.

Until you can build a working general purpose reprogrammable computer out of basic components from radio shack, you are not fit to call yourself a programmer in my presence. This is cwhizard, signing off.

No, that part is not undefined. The value of the index must be calculated before the assignment, as the assignment depends on the data at that memory location, hence the pointer must be resolved prior to assignment. The index into the array must be calculated before the pointer can be resolved, and i++ specifically uses the value prior to incrementing as the evaluated value, hence by the standard it must resolve to foo[i]. If it produces variant behavior, then the implementation is non-compliant.

The C standard states that:

Originally Posted by C99 Section 6.5 Paragraph 2

Between the previous and next sequence point an object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be read only to determine the value to be stored.

As an example of undefined behaviour, the C99 text provides:

Code:

a[i++] = i;

The C++ standard states that:

Originally Posted by C++03 Section 5 Paragraph 4 (part)

Between the previous and next sequence point a scalar object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be accessed only to determine the value to be stored.

The C++03 text provides an example:

Code:

i = v[i++]; // the behavior is unspecified

Now, the problem is that this violates the rule that "the prior value shall be read/accessed only to determine the value to be stored":

Code:

foo[i] = foo[i++];

I think that the problem with your reasoning is that you assume the effects of the increment must take place after foo[i] and foo[i++] are evaluated, since "the assignment depends on the data at the memory location". In fact, the increment can take effect at any point between the previous and next sequence points. For example, foo[i++] could be evaluated, and immediately thereafter the effects of the increment take effect, upon which foo[i] on the left hand side is evaluated, but now with i having the incremented value.

EDIT:

Originally Posted by Mario F.

Not really. Not that easy.
Check the 2nd example above. But generally? Yes. The 2nd example is also ugly code.

The reason is that the comma operator introduces a sequence point.

Originally Posted by abachler

The left side is always calculated first in any standard compliant compiler specifically because expressions on the right side may alter the index value in this way.

That is false (other than as an observation of what happens in existing standard compliant compilers). The copy assignment operator does not introduce a sequence point, and except for special cases the order of evaluation is otherwise unspecified.

No, that part is not undefined. The value of the index must be calculated before the assignment, as the assignment depends on the data at that memory location, hence the pointer must be resolved prior to assignment. The index into the array must be calculated before the pointer can be resolved, and i++ specifically uses the value prior to incrementing as the evaluated value, hence by the standard it must resolve to foo[i]. If it produces variant behavior, then the implementation is non-compliant.

What would be undefined however is foo[i] = foo[i++] + i, as there is no guarantee in which order the variables foo[i++] and i are evaluated, and hence it may add either i or i+1.

Are we sure about that?
I don't know the rules exactly, but can't the compiler resolve foo[i] = foo[i++] in different ways?

Let's say i = 0. The sure thing is that the value read on the right is foo[0], and i is 1 in the end. But when is the left side evaluated (the location the right value is written to)? If the compiler reads the right value first, it might do option #1. If it reads the left value first, it might do option #2. It might always leave the increment for the end of the whole statement, but is that for sure?

The compiler can and will resolve it. Undefined behavior does not mean a compiler can't do something. Undefined behavior is a feature of the programming language rules, not of the compiler. Compilers are profoundly deterministic and do not display undefined behavior. This does not mean however you will know how exactly the compiler will implement it, unless you read the resulting assembly.

In short, any rule that has undefined behavior means the compilers are free to implement it as they see fit. And this can even mean the same compiler may (and I believe they usually do) implement the same rule differently, on different contexts depending on its optimization concerns.

In short, any rule that has undefined behavior means the compilers are free to implement it as they see fit. And this can even mean the same compiler may (and I believe they usually do) implement the same rule differently, on different contexts depending on its optimization concerns.

So you are saying it would be almost foolish to do something that you know depends on an undefined behavior being resolved in a clearly defined way?

The compiler can and will resolve it. Undefined behavior does not mean a compiler can't do something. Undefined behavior is a feature of the programming language rules, not of the compiler. Compilers are profoundly deterministic and do not display undefined behavior. This does not mean however you will know how exactly the compiler will implement it, unless you read the resulting assembly.

Strictly, the behavior is unspecified, not undefined. Unspecified means "Something predictable will happen, but there are multiple possibilities and we do not specify which, and it is dependent on the compiler." Undefined means "Anything and everything could happen, including pink elephants emerging from your ass."

For instance, the representation of signed integers is implementation-defined, not undefined. If it were undefined then signed integers could not exist. It's just up to the implementation.

Or, if you look at it from the perspective of the person writing the compiler, when you see "unspecified" it means "up to you," whereas "undefined" means "don't even worry about this case."

Undefined behavior is always clearly defined by the implementation (the combination of compiler and CPU). The same code will always compile to the exact same machine code. There's nothing undefinable about compilers.

>> almost foolish

No. Entirely foolish. There is only one way to guarantee the code will behave as we expect: never change the code, always compile it with the same compiler, the same version, and the same flags, and run it on the same CPU. Since real-life code can very rarely live with these limitations, it would be a mistake to inspect the resulting assembly to determine how the compiler treats the undefined behavior and then decide how to write the code based on that. A new compiler version, the need to add or change code, or any other alteration could change how the compiler treats that undefined behavior.

>> So you are saying

No. I wasn't saying anything like that.

Instead, you can expect a compiler to behave predictably. That's what they do best. What this does not mean, however, is that one can predict (or has the ability to predict, for that matter) how the compiler will behave.

Undefined behavior is always clearly defined by the implementation (the combination of compiler and CPU). The same code will always compile to the exact same machine code. There's nothing undefinable about compilers.

I don't think that is true. You can update the compiler to a different version, or change the optimization flags and get completely different behavior. The issue here is not "undefined assembly code", it's "undefined behavior".
