C++ criticism by other people

This page is a collection of the best C++ criticism by FQA readers, copied
from e-mail messages and online discussions. If you know an interesting consequence
of C++ problems not mentioned in the FQA, please send me e-mail. Similarly to
the FQA errors page, this one lists things
that can be proved / tested rather than qualitative statements. The stuff is published with
credits or anonymously, according to the choice of each author.

The issues listed here (or the FQA itself) are not supposed to be "new" in the sense that
they were never discussed in a published work
(if C++ problems were so hard to discover as to take decades, discussing them wouldn't necessarily be worth the trouble).

There's lots of (well-reasoned or entertaining or both)
C++ criticism on the web, including several pieces by celebrity programmers. However, I made
the decision not to cite famous quotes by celebrities on this site. The main reason
is that I don't want to make the FQA more convincing to the people who mostly value the credentials
of an author and ignore things like facts and reasoning. I want to work less both with C++
and with these people. So I'd rather have them use C++ than convince them to switch to something else.

Yossi: The FQA doesn't talk much about implicit type conversions, since the FAQ doesn't. The problem
is quite important though. It wouldn't be so bad if C++ detected run time errors
(as opposed to compiling the second example to code modifying the wrong place), and/or
if so many C++ programmers didn't think that "with C++, when it compiles and links, it will run correctly"
(I actually heard this one, and then there are many large C++ monolithic applications without unit tests
speaking for themselves).

Note that this has nothing to do with safety and C++ being a "power tool allowing you to do dangerous things"
because it's so "high-performance". This argument only makes sense for explicit casts. What the code
demonstrates is unexpected interactions between pairs of different implicit conversions
(in the first example, bool -> char* -> std::string, in the second - B[] -> B* -> A*).

By the way, the second example explains
the FAQ's remark about arrays being evil in the context of inheritance and substitutability.
The thing is that with arrays of objects, there's an implicit type conversion that allows you to violate
the substitutability principle without a compile time error. With std::vector<B>, there's no implicit
cast. I didn't understand the FAQ was talking about that, because most of the time, when inheritance and
polymorphism are involved, you allocate arrays of pointers to objects. And in that case, there's no
difference between a vector<B*> and a B** - you'd need an explicit cast in both cases. I automatically thought
the question was the continuation of the discussion in preceding questions
about why the compiler wouldn't do the cast implicitly, and in that context
the "arrays are evil" remark didn't quite fit. I completely forgot about the arrays-of-objects case,
which is something a newbie coming from another language
(and with C++, some people stay "newbies" for years) could very well try to use.
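The examples this comment refers to are not reproduced above. As a minimal sketch of the second conversion chain (B[] -> B* -> A*), assuming two hypothetical classes A and B that are not taken from the original message, the following shows the call compiling without a cast while the callee steps through the array with the wrong element size. It only compares addresses and never dereferences the mismatched pointer:

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>
#include <vector>

// Hypothetical classes, used only for illustration.
struct A { int a; };
struct B : A { int b; };   // sizeof(B) > sizeof(A)

// Written against "an array of A": returns the address where the callee
// believes element 1 lives. The pointer is never dereferenced.
std::uintptr_t second_element_address(A* arr) {
    return reinterpret_cast<std::uintptr_t>(arr + 1);   // steps by sizeof(A)
}

// True if the callee's idea of element 1 differs from the real bs[1].
bool stride_mismatch() {
    B bs[2] = {};
    // B[] -> B* -> A*: accepted silently, no cast required.
    std::uintptr_t seen = second_element_address(bs);
    std::uintptr_t real =
        reinterpret_cast<std::uintptr_t>(static_cast<A*>(&bs[1]));
    return seen != real;
}

// With std::vector there is no corresponding implicit conversion.
static_assert(!std::is_convertible<std::vector<B>, std::vector<A> >::value,
              "vector<B> does not convert to vector<A>");
static_assert(!std::is_convertible<std::vector<B>*, std::vector<A>*>::value,
              "vector<B>* does not convert to vector<A>*");
```

A callee that actually wrote through arr[1] would touch the middle of bs[0] rather than bs[1], which is the "modifying the wrong place" behavior described above.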

C++ grammar: the type name vs object name issue

drorz: In C/C++ you cannot split parsing into separate syntax and semantic passes.
No existing compiler does it in two separate passes.

In the example:

AA BB(CC);

The parse tree is different in the following cases:

When AA and CC are types, BB is a function prototype.

When AA is a type and CC is a variable, then BB is a variable/object initialization.

When AA is a variable, AA BB(CC) is illegal and its parse tree is entirely different from the first two cases.
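The first two cases can be checked with a compiler. The sketch below assumes a hypothetical struct AA and puts the two meanings of CC into separate namespaces (the namespace names are made up for illustration); the third case is ill-formed and therefore left out:

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical type standing in for AA.
struct AA {
    int v;
    AA(int x) : v(x) {}
};

namespace cc_is_a_type {
    typedef int CC;
    AA BB(CC);   // declares a function BB taking a CC and returning AA
    static_assert(std::is_function<decltype(BB)>::value, "BB is a function");
}

namespace cc_is_a_variable {
    int CC = 7;
    AA BB(CC);   // defines an object BB of type AA, initialized from CC
    static_assert(std::is_same<decltype(BB), AA>::value, "BB is an object");
}
```

The token sequence AA BB(CC); is identical in both namespaces; only the earlier declaration of CC decides which parse applies.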

You cannot (more precisely, no one has done it in a real C/C++ compiler) fix a wrong parse tree in the semantic analysis pass.

Consider this example:

x * y(z);

in two different contexts:

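int main() {
    int x, y(int), z;   // x and z are ints, y is a function taking an int
    x * y(z);           // expression: x multiplied by the result of y(z)
}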

and

int main() {
    struct x { x(int) {} } *z;   // x is a type, z is a pointer to x
    x * y(z);                    // declaration: y is a pointer to x, initialized with z
}

In the first case x * y(z) is an expression, and in the second case it is a declaration of pointer y.
Parse trees for those cases are completely different.

Yossi: This is the first part of the problem making the C++ grammar undecidable. The second
part of the problem is that AA may really be Template<Params>::InnerDef, and figuring
out whether InnerDef is a type name or an object name is equivalent to solving the halting
problem, since templates may instantiate themselves recursively and in fact represent arbitrary
recursive functions. Maybe I'll expand on this one later. In particular, it has to do with
template specializations, which are discussed in the next item.

Purists who don't like the "nearly context-free" expression in Defective C++: when you write
parsers, it does make sense to discuss the "extent" of your dependence on the context.
For example, C++ inherits the type name/object name riddle from C. But in C, you can
solve it using a single dictionary of typedef names. Of course, theoretically the
important part is that the C grammar is decidable (though not context-free). In practice,
what matters is that it's easy to parse. In particular, you can use a parser generator for
context-free grammars (yacc/bison is one mature program in this family) with the
simple "symbol table hack" described above, and get a working parser. This is what
"nearly context-free" means.

I think the example is excellent, since it took me quite some time to figure out what the code
means myself (I think it's the asterisk). The AA BB(CC); example used in Defective C++ is simpler, but I think
it doesn't convey the point as clearly, since it apparently makes it intuitively easier to counter with something like
"you can solve the ambiguity at the semantic analysis stage". Note that you can always counter with that -
for example, you can say that the "parse tree" of your language is simply the list of characters
in the file, and the rest is semantic analysis.

If you "didn't care" about semantics during parsing, then confusing<1>::q is a typename,
so confusing<1>::q<3>(2) creates an object of type confusing<1>::q<3> with the argument 2.

If you "do" semantics during the syntax pass, then confusing<4> will be looked up,
confusing<4>::q is a variable. The declaration would "expand" to int x = (confusing<4>::q < 3) > 2.

You can see that parse trees in those cases are completely different, based on the output of the sizeof operator!
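The code this item discusses is not reproduced above. A minimal sketch of what such a confusing template could look like follows; the member definitions, the function names, and the use of sizeof(char) and sizeof(char[4]) to select the specialization are assumptions made for illustration, not the original snippet:

```cpp
#include <cassert>

// Primary template: q is a static data member.
template<int N> struct confusing { static int q; };
template<int N> int confusing<N>::q = 0;

// Specialization for N == 1: q is a member class template instead.
template<> struct confusing<1> {
    template<int M> struct q {
        int value;
        q(int v) : value(v + M) {}
        operator int() const { return value; }
    };
};

// sizeof(char[4]) == 4, so q is a variable and the tokens parse as
// (confusing<4>::q < 3) > 2.
int as_comparison() {
    return confusing<sizeof(char[4])>::q < 3 > (2);
}

// sizeof(char) == 1, so q is a template and the same tokens build a
// temporary of type confusing<1>::q<3> from the argument 2.
int as_construction() {
    return confusing<sizeof(char)>::q < 3 > (2);
}
```

Everything after the :: is the same token sequence in both functions; a compiler may warn about the chained comparison in the first one, which is a fair summary of the problem.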

Yossi: ...and sizeof depends on the platform and the implementation details of inheritance
(including multiple and virtual), virtual functions, etc. The "parser" gets closer and closer
to a full-blown compiler.

The problem is the freedom that template specializations have when defining members. Now if anybody
showed me a useful application of the ability to define something as an inner type in one specialization
and a static variable in some other specialization, I'd be surprised.

The reddit thread from which this and the previous example are taken has a detailed discussion about parsing C++.

printf, iostream and internationalization

Alexander E. Patrakov (patrakov at ums dot usu dot ru): The FQA
lists valid information for and against the use of <iostream> instead of <cstdio>.
There is, however, one more thing for <cstdio> and against <iostream>: the
possibility to translate program messages to a different natural
language (using, e.g., gettext). And here I don't mean that there is
currently no gettext equivalent for C++ iostreams, but that there is no
way to design such a thing correctly.

Translation works on phrases, not on their parts. Consider, e.g., such C
statements:

A well-designed program fetches translations from a message catalog,
Windows resource or anywhere else except its own source code. With C
and gettext message catalogs, the translator sees whole phrases such as "Read %d files",
"New data were found in %d files", etc. If the same approach were applied to C++, the
translator would see just "Read ", "New data were found in ", and " files" (used twice).
Lack of context is the least of all worries. The
real problem is that, e.g., when translating to Russian, the two
instances of " files" have to be translated slightly differently, because
Russian has six grammatical cases and different cases are required
in the two sentences:

(approximately - I don't want to
overwhelm the example with the singular/plural treatment)
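The statements and translations referred to above are not reproduced here. The contrast can be sketched as follows, assuming a hypothetical translate() helper and an in-memory map standing in for a real message catalog such as gettext's; the Russian strings are illustrative and chosen for a count of 5, so that plural handling stays out of the picture:

```cpp
#include <cassert>
#include <cstdio>
#include <map>
#include <sstream>
#include <string>

typedef std::map<std::string, std::string> Catalog;

// Hypothetical stand-in for a catalog lookup such as gettext():
// returns the translation when one exists, otherwise the original text.
std::string translate(const Catalog& catalog, const std::string& msgid) {
    Catalog::const_iterator it = catalog.find(msgid);
    return it == catalog.end() ? msgid : it->second;
}

// printf style: the translator sees the whole phrase as one entry.
std::string format_phrase(const Catalog& catalog, const char* msgid, int n) {
    const std::string fmt = translate(catalog, msgid);
    char buf[256];
    std::snprintf(buf, sizeof buf, fmt.c_str(), n);
    return buf;
}

// iostream style: the phrase is assembled from separately translated pieces.
std::string stream_phrase(const Catalog& catalog, const char* before,
                          int n, const char* after) {
    std::ostringstream out;
    out << translate(catalog, before) << n << translate(catalog, after);
    return out.str();
}

// Catalog for the printf-style messages: one entry per phrase.
Catalog russian_phrases() {
    Catalog c;
    c["Read %d files"] = "Прочитано %d файлов";
    c["New data were found in %d files"] = "Новые данные найдены в %d файлах";
    return c;
}

// Catalog for the iostream-style fragments: " files" can carry only one
// translation, although the two sentences need different cases.
Catalog russian_fragments() {
    Catalog c;
    c["Read "] = "Прочитано ";
    c["New data were found in "] = "Новые данные найдены в ";
    c[" files"] = " файлов";   // right after "Прочитано", wrong after "найдены в"
    return c;
}
```

With the fragment catalog, the second sentence comes out ending in "файлов" where the phrase catalog correctly gives "файлах", and no single translation of " files" can fix both sentences.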

Even worse, examples exist with two format substitutions where they have
to be reordered when translating. C (or, more precisely, the Single UNIX
Specification) allows such reordering with something like
printf("%2$d x %1$d inches", width, num); but in C++ the output order of fields is
hard-coded.
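A small sketch of the reordering, assuming a C library that supports the numbered %n$ conversions (they come from POSIX / the Single UNIX Specification, not ISO C, so some platforms lack them). The describe() helper is hypothetical:

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Formats width and num with a format string chosen at run time, for
// example one fetched from a translation catalog. The argument order in
// the call is fixed; the format string decides the output order.
// Assumption: the C library implements POSIX numbered conversions.
std::string describe(const char* format, int width, int num) {
    char buf[128];
    std::snprintf(buf, sizeof buf, format, width, num);
    return buf;
}
```

The call site passes (width, num) in both cases below; only the format string, which a translator can edit, differs: describe("%2$d x %1$d inches", 5, 3) and describe("%1$d inches, %2$d pieces", 5, 3).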

The downside is, of course, that nobody except the translator checks
the translated format string, and wrongly-copied conversion specifiers
can crash a program in the corresponding locale (and this did happen
with sed and vim in the past).

See how Trolltech handles the above-mentioned problems in their Qt toolkit.

Yossi: I really like this example because it can be a real eye-opener for a practical programmer,
and I wish I had heard and thought about it several years ago. Clearly the printf interface gets
something right that iostream doesn't, since it seems to save us lots of trouble. What is it that printf
gets right? Could it be that representing the program structure using compile time constructs incomprehensible
to any tool except for the compiler is not the way to go? Effectively the advantage of a printf-based program
is that it's easier for other programs to manipulate. The idea that backfires is that program structure may be encoded
in arbitrarily complex ways and the only one who ever has to worry about it is the compiler writer.

But maybe this translation business is a singularity in the computing universe, and we
shouldn't infer general conclusions from it? Well, here's another example. Suppose you want to do real time logging. You don't have
enough time and/or bandwidth to do the formatting at the target machine. And yet you want to log free
text, not some strict binary format with versioning schemes and fixed size limits and other headaches.
With a printf-style interface, you can log packets of (for example) 32 bit words - size, constant format string
pointer, and the list of arguments. You can then extract the format strings from the executable file
(reading ELF or COFF files is easy - there are examples on the net of about 200 lines of C code),
and do the formatting at the host machine. Now, with an iostream-like interface, the format string is split into many
little parts, and all kinds of types come in the middle - types of data items have to be encoded in the logged packets, too.
And you'd have to log calls to I/O manipulators such as hex, setfill, etc.
Clearly the overhead per logged data word is going to increase significantly.
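The printf-style scheme can be sketched in a single process as follows. The LogPacket layout, the function names and the limit of four arguments are assumptions made for illustration; a real target would transmit the packets and the host would resolve the format pointer against the executable file rather than dereference it directly. The sketch also assumes that int is 32 bits wide and that only %d conversions are logged:

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// One logged record: argument count, pointer to a constant format string,
// and the raw argument words. Hypothetical layout.
struct LogPacket {
    std::uint32_t argc;      // number of argument words actually used
    const char* format;      // points into the program's constant data
    std::int32_t args[4];
};

// "Target" side: no formatting, just record the pointer and the words.
LogPacket make_packet(const char* format, std::int32_t a) {
    LogPacket p = { 1, format, { a, 0, 0, 0 } };
    return p;
}

LogPacket make_packet(const char* format, std::int32_t a, std::int32_t b) {
    LogPacket p = { 2, format, { a, b, 0, 0 } };
    return p;
}

// "Host" side: the expensive formatting happens here, later.
std::string format_packet(const LogPacket& p) {
    char buf[256];
    // Arguments beyond those the format consumes are ignored by printf.
    std::snprintf(buf, sizeof buf, p.format,
                  static_cast<int>(p.args[0]), static_cast<int>(p.args[1]),
                  static_cast<int>(p.args[2]), static_cast<int>(p.args[3]));
    return buf;
}
```

Each logged event costs a pointer plus its argument words on the target side, whatever the length of the message text.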

Think about it: how can it be that a simplistic "1 format string plus N arguments of dynamic types" interface beats
an advanced "statically dispatched polymorphic operators" interface,
and what makes it surprising to you?

Static binding rules

Miguel Catalina: The following test program does not compile under gcc 4.3.{1,2}:

So it turns out that the operation that is giving us trouble is the &&
inside the __enable_if in the template declaration of pow(). We are
invoking && with two enum operands (__is_arithmetic<T>::__value is an
unnamed enum). I guess the compiler is treating unnamed enums as
ints. So the compiler is
trying to call operator&&(int,int). But there is no such operator;
there are only operator&&(my_class,int) and operator&&(bool,bool). So
the compiler is trying to do an implicit conversion of the operands so
that they can match the available prototypes. There are implicit ways
of converting an int to a my_class, as well as converting an int to a
bool. The compiler does not know which one to use, hence the
ambiguity.

The question is: why on Earth when you are trying to invoke a
function that only deals with doubles, do you have to deal with the
ambiguity between two available implicit conversions for types that
have nothing to do with double?

Yossi: Takes time to wrap one's mind around this, um, treason
(don't you just love the public __traitor bit? I guess a "traitor"
is something used to generate so-called "type traits", a key idiom
in the world of C++ templates arcana). Now that we (presumably) understand
the error message, how would you work around the problem? If the compiler
barfed while trying to dispatch an operator on user-defined types,
we could define the operator with exactly the prototype it would
pick as the best match (as the GNU STL implementors themselves do
in similar situations). But we can't define operator&&(int,int). Now what?