(Note: This is meant to be an entry to Stack Overflow's C++ FAQ. If you want to critique the idea of providing an FAQ in this form, then the posting on meta that started all this would be the place to do that. Answers to that question are monitored in the C++ chatroom, where the FAQ idea started out in the first place, so your answer is very likely to get read by those who came up with the idea.)

@Prasoon: C++11 has been out for a while now, can you change the accepted answer to the C++0x version, rewrite its intro to say C++11, and add a link from there to the other answer with C++03 rules?
– Ben Voigt, May 12 '14 at 1:24

What are Sequence Points?

The Standard says

At certain specified points in the execution sequence called sequence points, all side effects of previous evaluations
shall be complete and no side effects of subsequent evaluations shall have taken place. (§1.9/7)

Side effects? What are side effects?

Evaluating an expression produces a result. If the evaluation also changes the state of the execution environment, the expression (its evaluation) is said to have side effect(s).

For example:

int x = y++; //where y is also an int

In addition to the initialization, the value of y is changed by the side effect of the ++ operator.
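As a minimal sketch of that example, the function below separates the two things y++ does: it yields the old value of y, and as a side effect it stores the incremented value. The name post_increment_demo and the starting value 5 are made up for this illustration.

```cpp
#include <cassert>
#include <utility>

// Hypothetical helper: runs `int x = y++;` with y starting at 5 and
// returns the pair (x, y) observed after the statement has completed.
std::pair<int, int> post_increment_demo() {
    int y = 5;
    int x = y++;   // value: the old y (5); side effect: y becomes 6
    assert(x == 5 && y == 6);   // both hold once the full-expression has ended
    return std::make_pair(x, y);
}
```

Calling post_increment_demo() yields first == 5 and second == 6.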

So far so good. Moving on to sequence points. An alternative definition of sequence points is given by the comp.lang.c author Steve Summit:

Sequence point is a point in time at which the dust has settled and all side effects which have been seen so far are guaranteed to be complete.

What are the common sequence points listed in the C++ Standard?

Those are:

at the end of the evaluation of a full-expression (§1.9/16) (A full-expression is an expression that is not a subexpression of another expression.) [1]

Example :

int a = 5; // ; is a sequence point here

in the evaluation of each of the following expressions, after the evaluation of the first expression (§1.9/18) [2]

a && b (§5.14)

a || b (§5.15)

a ? b : c (§5.16)

a , b (§5.18) (in func(a,a++) the , is not a comma operator; it is merely a separator between the arguments a and a++. The behaviour is undefined in that case if a is of a primitive type)

at a function call (whether or not the function is inline), after the evaluation of all function arguments (if any), which takes place before execution of any expressions or statements in the function body (§1.9/17).

1 : Note : the evaluation of a full-expression can include the evaluation of subexpressions that are not lexically part of the full-expression. For example, subexpressions involved in evaluating default argument expressions (8.3.6) are considered to be created in the expression that calls the function, not the expression that defines the default argument.

2 : The operators indicated are the built-in operators, as described in clause 5. When one of these operators is overloaded (clause 13) in a valid context, thus designating a user-defined operator function, the expression designates a function invocation and the operands form an argument list, without an implied sequence point between them.
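The sequence points of the built-in && and comma operators can be illustrated with a small sketch. It assumes the built-in operators on int (not overloaded ones, which behave like function calls as note 2 says); the function names are invented for the example.

```cpp
#include <cassert>

// Built-in &&: the left operand, including its side effect on i, is
// complete before the right operand is evaluated.
bool logical_and_demo() {
    int i = 0;
    bool result = (i++ == 0) && (i == 1);
    assert(i == 1);   // the increment happened exactly once
    return result;
}

// Built-in comma operator: i++ is complete before i + 10 is evaluated.
int comma_demo() {
    int i = 0;
    int result = (i++, i + 10);
    return result;
}
```

Here logical_and_demo() returns true and comma_demo() returns 11, because in both cases the right operand sees the incremented i.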

What is Undefined Behaviour?

The Standard defines Undefined Behaviour in Section §1.3.12 as

behaviour, such as might arise upon use of an erroneous program construct or erroneous data, for which this International Standard imposes no requirements [3].

Undefined behaviour may also be expected when this International Standard omits the description of any explicit definition of behavior.

3 : permissible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).

In short, undefined behaviour means anything can happen from daemons flying out of your nose to your girlfriend getting pregnant.

1) Between the previous and next sequence point a scalar object shall have its stored value modified at most once by the evaluation of an expression.

What does it mean?

Informally it means that between two sequence points a variable must not be modified more than once.
In an expression statement, the next sequence point is usually at the terminating semicolon, and the previous sequence point is at the end of the previous statement. An expression may also contain intermediate sequence points.

From the above sentence the following expressions invoke Undefined Behaviour.

i++ * ++i;          // i is modified more than once
i = ++i;            // same as above
++i = 2;            // same as above
i = ++i + 1;        // same as above
++++++i;            // parsed as (++(++(++i)))
i = (i, ++i, ++i);  // undefined behaviour: there is no sequence point between the rightmost `++i` and the assignment to `i`, so `i` is modified more than once between two sequence points
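One way to stay clear of this rule is to give each modification its own statement. The sketch below shows possible rewrites of two of the expressions above. Since the originals have no defined meaning, the intent assumed for each one (stated in the comments) is only an assumption, and the function names are made up.

```cpp
#include <cassert>

// Assumed intent of `i = ++i + 1;`: increment i, then add one more.
int increment_then_add_one(int i) {
    ++i;          // first modification, finished at the end of this statement
    i = i + 1;    // second modification, in a separate full-expression
    return i;
}

// Assumed intent of `i++ * ++i`, read strictly left to right.
int product_left_to_right(int i) {
    int left = i++;    // left operand: old value of i, then i is incremented
    int right = ++i;   // right operand: i is incremented again, new value used
    assert(right == left + 2);
    return left * right;
}
```

With these definitions increment_then_add_one(3) gives 5 and product_left_to_right(3) gives 15, on every conforming compiler.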

2) Furthermore, the prior value shall be accessed only to determine the value to be stored.

What does it mean? It means if an object is written to within a full expression, any and all accesses to it within the same expression must be directly involved in the computation of the value to be written.

For example, in i = i + 1 all accesses of i (on the L.H.S. and on the R.H.S.) are directly involved in the computation of the value to be written. So it is fine.

This rule effectively constrains legal expressions to those in which the accesses demonstrably precede the modification.

An expression such as a[i] = i++ is disallowed because one of the accesses of i (the one in a[i]) has nothing to do with the value which ends up being stored in i (which happens over in i++), and so there's no good way to define--either for our understanding or the compiler's--whether the access should take place before or after the incremented value is stored. So the behaviour is undefined.
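Under the rules described here, an expression that both indexes with i and increments i (such as a[i] = i++) can be replaced by two statements. The sketch below picks one possible intent, "store the old i at the old index, then increment"; that intent, the helper name and the array size of 4 are assumptions for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical well-defined replacement for `a[i] = i++;`.
// Returns the incremented index.
int store_then_increment(std::array<int, 4>& a, int i) {
    assert(i >= 0 && i < 4);              // stay inside the array
    a[static_cast<std::size_t>(i)] = i;   // the read of i and the write to a[i]
    ++i;                                  // the modification of i, on its own
    return i;
}
```

Because the index read and the increment now sit in different full-expressions, there is no question about which one happens first.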

@Mike: AFAIK, there are no (legal) copies of the C++ Standard you could link to.
– sbi, Nov 14 '10 at 9:19

Well, then you could have a link to the ISO's relevant order page. Anyway, thinking about it, the phrase "elementary knowledge of C++ Standard" seems a bit of a contradiction in terms, since if you're reading the standard, you're past the elementary level. Maybe we could list what things in the language you need a basic understanding of, like expression syntax, order of operations, and maybe operator overloading?
– Mike DeSimone, Nov 14 '10 at 16:00

I'm not sure quoting the standard is the best way to teach newbies
– Inverse, Nov 14 '10 at 18:28

@Adrian The first expression invokes UB because there is no sequence point between the last ++i and the assignment to i. The second expression does not invoke UB because expression i does not change the value of i. In the second example the i++ is followed by a sequence point (,) before the assignment operator is called.
– Kolyunya, Jul 1 '13 at 7:09

This is a follow-up to my previous answer and covers the C++0x-related material.

Pre-requisites : An elementary knowledge of Relations (Mathematics).

I heard someone saying that there are no Sequence Points in C++0x. Is this true?

Yes! This is very true.

Sequence Points have been replaced by the clearer Sequenced Before and Sequenced After relations in C++0x.

But why? I loved sequence points. :(

The ISO C++ Committee members thought that the Sequence Points material was quite difficult to understand. So they decided to replace it with the above-mentioned relations, for clearer wording and greater precision.

If you loved Sequence Points you will love Sequenced[Before/After] relation even more. So there's nothing to worry about.

What exactly is this `Sequenced before` thing BTW?

Sequenced before is an asymmetric, transitive relation between evaluations executed by a single thread, and it induces a strict partial order [1].

Formally, it means that given any two evaluations (see below) A and B, if A is sequenced before B, then the execution of A shall precede the execution of B. If A is not sequenced before B and B is not sequenced before A, then A and B are unsequenced [2].

Evaluations A and B are indeterminately sequenced when either A is sequenced before B or B is sequenced before A, but it is unspecified which [3].

[NOTES]
1 : A strict partial order is a binary relation "<" over a set P which is asymmetric and transitive, i.e., for all a, b, and c in P, we have that:
(i) if a < b then ¬ (b < a) (asymmetry);
(ii) if a < b and b < c then a < c (transitivity).
2 : The execution of unsequenced evaluations can overlap.
3 : Indeterminately sequenced evaluations cannot overlap, but either could be executed first.
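Indeterminate sequencing can be observed with two function calls used as operands of +. In the sketch below (all names are hypothetical) each call appends a letter to a global trace. Note 3 says the calls cannot overlap, so the trace always holds both letters, but the Standard does not say which comes first.

```cpp
#include <cassert>
#include <string>

std::string trace;   // records the order in which the two calls ran

int left()  { trace += 'L'; return 1; }
int right() { trace += 'R'; return 2; }

// The two calls are indeterminately sequenced: one runs to completion
// before the other starts, but the order is unspecified.
int sum_with_trace() {
    trace.clear();
    int result = left() + right();
    assert(trace.size() == 2);   // both calls ran, neither interleaved
    return result;
}
```

The result is always 3. A portable program may rely on that, but not on whether trace ends up as "LR" or "RL".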

What is the meaning of the word `evaluation` in the context of C++0x?

In C++0x, evaluation of an expression (or a sub-expression) in general includes value computations and the initiation of side effects.

The Standard (§1.9/14) then says:

Every value computation and side effect associated with a full-expression is sequenced before every value computation and side effect associated with the next full-expression to be evaluated.

Example (trivial):

int x;
x = 10;
++x;

The value computation and side effect associated with ++x are sequenced after the value computation and side effect of x = 10;

So there must be some relation between Undefined Behaviour and the above mentioned things, right?

Yes! Right.

In (§1.9/15) it has been mentioned that

Except where noted, evaluations of operands of individual operators and of subexpressions of individual expressions are unsequenced [4].

For example :

int main()
{
    int num = 19;
    num = (num << 3) + (num >> 3);
}

1) The evaluations of the operands of the + operator are unsequenced relative to each other.
2) The evaluations of the operands of the << and >> operators are unsequenced relative to each other.

4 : In an expression that is evaluated more than once during the execution of a program, unsequenced and indeterminately sequenced evaluations of its subexpressions need not be performed consistently in different evaluations.

(§1.9/15)
The value computations of the operands of an
operator are sequenced before the value computation of the result of the operator.

That means in x + y the value computations of x and y are sequenced before the value computation of (x + y).
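Unsequenced operands are harmless as long as none of them has a side effect on an object that another one uses. A minimal sketch of the shift example, wrapped in a function whose name and range check are assumptions added here:

```cpp
#include <cassert>

// Both operands of + only read num. Their value computations are
// unsequenced relative to each other, but both are sequenced before the
// addition, and the single write to num happens in the assignment.
int shift_sum(int num) {
    assert(num >= 0 && num < (1 << 20));   // keep both shifts well inside int range
    num = (num << 3) + (num >> 3);
    return num;
}
```

For num = 19 this gives 152 + 2 = 154, whichever operand happens to be computed first.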

More importantly

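(§1.9/15) If a side effect on a scalar object is unsequenced relative to either another side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined.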

When calling a function (whether or not the function is inline), every value computation and side effect associated with any argument expression, or with the postfix expression designating the called function, is sequenced before execution of every expression or statement in the body of the called function. [ Note: Value computations and side effects associated with different argument expressions are unsequenced. — end note ]
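The guarantee that argument evaluation is complete before the body starts can be sketched as follows. The global last_set and the three function names are hypothetical and exist only to make the ordering visible.

```cpp
#include <cassert>

int last_set = 0;   // written as a side effect of evaluating the argument

int set_and_return(int v) { last_set = v; return v; }

// When this body starts, the argument's side effect is already complete.
int read_in_body(int arg) {
    assert(last_set == arg);
    return last_set + arg;
}

int call_demo() {
    last_set = 0;
    return read_in_body(set_and_return(7));
}
```

call_demo() returns 14: the write to last_set made while evaluating the argument is sequenced before every statement in read_in_body.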

Expressions (5) and (6) do not invoke undefined behaviour. Check out the following answers for a more detailed explanation.

Instead of "asymmetric", sequenced before / after are "antisymmetric" relations. This should be changed in the text to conform to the definition of a partial order given later (which also agrees with Wikipedia).
– TemplateRex, Jul 21 '12 at 21:47

Why is 7) item in the last example an UB? Maybe it should be f(i = -1, i = 1)?
– Mikhail, Mar 18 '14 at 12:04

I fixed the description of the "sequenced before" relation. It is a strict partial order. Obviously, an expression cannot be sequenced before itself, so the relation cannot be reflexive. Hence it is asymmetric not anti-symmetric.
– ThomasMcLeod, Jun 24 '14 at 20:52

5) being well defined blew my mind. The explanation by Johannes Schaub wasn't entirely straightforward to get, especially because I believed that even in ++i (being value evaluated before the + operator that is using it), the standard still doesn't say that its side effect must be finished. But in fact, because it returns a ref to an lvalue which is i itself, it MUST have finished the side effect since the evaluation must be finished, therefore the value must be up to date. This was the crazy part to get in fact.
– v.oddou, Jan 9 at 6:31

Sequence points

Sequence points are points in the execution of a program where all side effects produced by evaluations prior to the sequence point have been completed. Side effects produced by evaluations that occur after the sequence point are therefore separated from side effects produced by earlier evaluations, and happen afterwards.

Evaluations

Evaluating something means to apply some runtime semantics to an expression. There are unevaluated expressions (operands of sizeof, some operands of typeid and such) that only inspect the expression's type and don't have meaning at runtime. If an expression is evaluated, it can result in a value which may imply reading values out of objects, or it may just evaluate to an object without reading its value (it then remains an lvalue, as with the left subexpression of an assignment). In addition, it can produce side effects as necessary. An evaluation is complete if its value is known, but until a sequence point has been reached, side effects produced by the evaluation must be assumed to still be in progress.

You have sequence points after all evaluations that usually need to be processed completely before certain other expressions are processed. These are:

After evaluation of a in a && b and a || b and a ? b : c. Also after evaluation of a in a, b - this operator is called the "comma operator".

For a function call, after evaluating the function call arguments and before starting evaluations in the function body.

After the evaluation of a complete expression (one that wasn't evaluated as part of another expression). Examples are loop conditions, if conditions, switch values and expression statements.

Immediately before a function terminates (by unwinding the function by an exception or by ordinarily returning it after (possibly) creating the return value). This makes sure that every side effect in a function really has been settled and is completely processed.

Side effects

A side effect is a change in the execution environment of the program that happens in addition to simply computing a value. This can be (among others) writing to an object, calling an input/output function or calling a function that does so.

Flow of program execution

With these three terms, the flow of a program can be visualized as follows. In the following diagrams, an E(x) specifies the evaluation of a (sub-)expression x, a % specifies a sequence point and an S(k, e) specifies a side effect k on an object e. If an evaluation needs to read a value from a named object (if x is a name), the evaluation is written as V(x), otherwise it's written as E(x). Side effects are written to the right and left of the expressions. An edge between two expressions means that the upper expression is evaluated before the lower expression (usually because the lower expression depends on the value or lvalue of the upper expression).

If you look at the two expression statements i++; i++;, you can depict the following diagram

E(i++) -> { S(increment, i) }
|
%
|
E(i++) -> { S(increment, i) }
|
%

As can be seen, there are two sequence points, and one of them separates the two modifications of i. Function call arguments are interesting too, although I will omit the diagram for this

Wow, where do the two branches come from? Remember from the initial definition of sequence point: Sequence points affect evaluations that occur prior to it. All subexpressions of the multiplication are evaluated prior to it and there is no other sequence point, so we must assume "maximal parallelism" to find where we potentially have concurrent writes to the same object. More formally, the two branches are not ordered. The sequence point relation is thus a relation that orders some evaluations relative to each other and doesn't order others: It's therefore a partial order.

Conflicting side effects

To give the compiler maximal freedom in generating and optimizing machine code, cases like the multiplication above don't sequence the evaluations of subexpressions and don't separate the side effects produced by them except in the few cases outlined above. This can lead to conflicts, and the C++ Standard marks behavior of programs undefined if they try to modify the same object without an intervening sequence point (really, it applies to scalar objects, because other objects are either non-modifiable (arrays) or just aren't applicable to this rule (class objects)). Behavior is also undefined if a previous value is read from the object but there is a modification too, as in i * i++.

As we see here, the value of i is read on the right side and after the evaluation of both sides the assignment takes place. So we have a side effect and the read of i's value without an intervening sequence point, but the read was only to determine the value to be stored into i, so it is fine.

Sometimes, a value is read after a modification was done. This is the case for a = (b = 0), which in C++ will write to b and then read from b, without an intervening sequence point! This however is fine, because it does not read the previous value of b, but the new value of it. In this case, the side effect of the assignment to b has been completed not only before the next sequence point, but also before the read of b, as needed for the assignment to a to get the new value from b. In the spec, this relation is established by explicit constraints; in this case it appertains in particular to b = 0 and reads "The result of the assignment operation is the value stored in the left operand after the assignment has taken place; the result is an lvalue." Why not a sequence point to make this relation? Because a sequence point would have the undesirable effect of requiring every side effect that happens in the evaluation of the left and right operand to be complete, instead of doing so only for the assignment in case its resulting lvalue is read from.
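The chained assignment can be checked with a short sketch. The function name and the initial values 1 and 2 are arbitrary choices for the illustration.

```cpp
#include <cassert>

// a = (b = v): the inner assignment stores v in b, and its result is b
// after that store, so a always receives the new value of b.
int chained_assignment(int v) {
    int a = 1;
    int b = 2;
    a = (b = v);
    assert(b == v);   // the write to b is complete before its value reaches a
    return a;
}
```

chained_assignment(0) returns 0 and chained_assignment(5) returns 5. The old value of b (2) is never observed.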

Closing words

It should be noted that temporaries created in the evaluation of a full-expression are usually not cleaned up before the very next sequence point but only when the full-expression has been completely evaluated (in certain situations, the lifetime of temporaries will instead be even longer if there were references bound to them).

By my understanding, if 'b' is not volatile, the statement a=(b=c); could legitimately cast c to b's type, and then store that to both b and a without re-reading b, but if b is volatile, the compiler must re-read b since it might have changed (and since there may be situations, especially if b is something like a hardware register, where there would be no way for it not to have changed). Is my understanding correct?
– supercat, Nov 20 '11 at 20:10

@supercat sure. as long as b and a are independent variables that would be fine. but if a and c are int references referring the same object for example, the compiler must be careful to do the write to a after the write to b. if one cannot figure out what the compiler did, the compiler can do anything of course.
– Johannes Schaub - litb, Nov 21 '11 at 5:42

I am guessing there is a fundamental reason for the change, and that it isn't merely cosmetic to make the old interpretation clearer: that reason is concurrency. An unspecified order of elaboration is merely the selection of one of several possible serial orderings. This is quite different from before and after orderings, because if there is no specified ordering, concurrent evaluation is possible: not so with the old rules. For example in:

f (a,b)

Previously, either a was evaluated and then b, or b and then a. Now, a and b can be evaluated with instructions interleaved or even on different cores.

I believe, though, that if either 'a' or 'b' includes a function call, they are indeterminately sequenced rather than unsequenced, which is to say that all side-effects from one are required to occur before any side-effects from the other, although the compiler need not be consistent about which one goes first. If that is no longer true, it would break a lot of code which relies upon the operations not overlapping (e.g. if 'a' and 'b' each set up, use, and take down, a shared static state).
– supercat, Dec 9 '10 at 20:35