This is indeed an excellent example of what I was referring to in my last
post. It also raises another interesting issue.

If you have a machine with multiple functional units, or pipelined
multiplies, the BAD form here may run faster than the GOOD form. The
problem is that the "GOOD" form makes you do all the adds/multiplies
sequentially. With the "BAD" form, they can be overlapped.

Now the interesting issue. It is well known that converting from the "BAD"
form to the "GOOD" form above may be numerically unstable. In general, for
any polynomial, you want to add the smallest terms together first and then
work through to the bigger terms. Of course, the "BAD" form does not
necessarily do this either. If your language guarantees left-to-right
evaluation and x is small, then the "BAD" form above should have been
written the other way around!
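The effect of summation order is easy to demonstrate; a sketch in Python (IEEE double precision, with 1e16 chosen so that adding 1.0 to it falls below one ulp and rounds away):

```python
terms = [1e16] + [1.0] * 1000

# Biggest term first: each 1.0 is below half an ulp of the running
# sum, so every single one is rounded away.
big_first = 0.0
for t in terms:
    big_first += t

# Smallest terms first: the 1.0s accumulate exactly to 1000.0,
# which then survives the final add to 1e16.
small_first = 0.0
for t in reversed(terms):
    small_first += t

print(big_first)    # 1e16 -- the thousand 1.0s vanished
print(small_first)  # 1e16 + 1000, the exact answer
```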

The reorganisation that one wants to do to schedule all the operations can
also cause numerical instability.

So, we have a compromise between accuracy and speed. Some languages
address this by saying that an optimizer must respect parentheses so that
if you wrote

P = c(1) + (c(2)*x + (c(3)*x**2 + (c(4)*x**3 + (c(5)*x**4))))

optimization would be inhibited.

Compiler writers are always faced with the problem that users are going to
run real code through their compilers and compare the output with that
from other machines. It is really hard to explain to a user that he gets
the wrong answers from your compiler because it's better than his old
compiler, or that if he wrote his code properly he would not be having
the problem in the first place!

The problems that this causes make you wish that the hardware people
would come up with an adder that can take a vector and add all the values
together in a numerically stable order. I wonder if anyone has ever
designed such a thing?
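I don't know of such hardware, but the usual software answer is compensated (Kahan) summation, which carries the rounding error of each add along in a second variable instead of requiring the terms to be sorted; a sketch in Python:

```python
def kahan_sum(values):
    total = 0.0
    c = 0.0  # compensation: the low-order bits lost so far
    for v in values:
        y = v - c            # fold the correction into the next term
        t = total + y        # big add; low bits of y may be rounded off
        c = (t - total) - y  # algebraically zero; recovers what was lost
        total = t
    return total
```

Summing [1e16] + [1.0]*1000 naively left to right gives 1e16, losing every one of the thousand 1.0s; kahan_sum returns 1e16 + 1000 exactly. It is not a cure-all, though: massive cancellation between terms larger than the running sum can still defeat it, which is what Neumaier's variant of the algorithm addresses.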
--