I'd suggest -O2, which is the most commonly used level for both GCC and ICC. As for GCC arch flags, following what distros use seems like a good idea: that would be "-march=i686 -mtune=generic" for 32-bit. I'm not sure what is used for 64-bit, though.

Thanks for your input. I will try to see which flags are most relevant. I might have to set different ones for different compilers too, I guess (need to read some manpages).

Feel free to suggest improvements. The more measurement points the clearer the picture (hopefully).

Updated with optimizations (GCC).

Not much changed when optimizations were used (I suppose there are default settings to start with). Surprisingly, -O2 often performed better than -O3. I am considering trying LTO later.

Next up, however, will be similar analyses of optimization levels for ICC, Clang and other compilers that have such options.
After that I think I will move on to 32-bit benchmarks, where there are a number of other interesting compilers to test...

@staalmannen
Since you'll be running more tests, could you add -Os (optimise for size) to the gcc optimisation options tested? Also for icc, clang and pcc if they have a similar option.

It is well known that -O3 leads to better performance than -O2 only in very specific cases. Part of the reason is that -O3 binaries are larger, which makes me suspect that -Os could perform better than -O2 in some cases.

It's true, and some people have already measured this. Because -Os produces small executables, the CPU wastes less time moving code in and out of cache, and in some cases this performs better than -O2 and -O3. This matters even more on CPUs with small caches. Some kernel devs recommend the -Os flag for compiling the kernel.

Sure, I will try that after I have tried -O2 and -O3 for Clang and Open64, along with the -Os tests for the four compilers supporting it (ICC, GCC, Clang, Open64).
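For anyone wanting to reproduce this kind of sweep by hand, here is a rough Python sketch of how the compiler/optimization-level matrix could be generated. It is not how the Phoronix suite does it. The driver names (in particular "opencc" for Open64), the source file name "bench.c" and the output naming scheme are all assumptions for illustration; the script only builds the command lines and does not run them.

```python
from itertools import product

# Assumed driver names; "opencc" is taken to be Open64's C driver.
# Adjust to whatever is installed on your system.
COMPILERS = ["gcc", "icc", "clang", "opencc"]
LEVELS = ["-O2", "-O3", "-Os"]


def build_matrix(source, compilers=COMPILERS, levels=LEVELS, extra_flags=()):
    """Return one (label, argv) pair per compiler/level combination."""
    jobs = []
    for cc, level in product(compilers, levels):
        label = f"{cc}{level}"            # e.g. "gcc-O2"
        output = f"bench-{label}"         # hypothetical output name
        argv = [cc, level, *extra_flags, source, "-o", output]
        jobs.append((label, argv))
    return jobs


if __name__ == "__main__":
    # "bench.c" is a placeholder source file.
    for label, argv in build_matrix("bench.c"):
        print(label, " ".join(argv))
```

The 32-bit arch flags suggested earlier in the thread could be passed through as extra_flags=("-march=i686", "-mtune=generic") for the GCC runs only, since other compilers may not accept them.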

If anyone knows what flags are recommended for tcc and pcc I am all ears.

In addition, if anyone knows how to "unclutter" a big result file on Phoronix Global, that would be appreciated.

I still want all data in one graph since that actually gives additional value (comparisons between compilers X different optimization levels).

One pattern that seems to be emerging, for example, is that a longer compile time does not necessarily mean a better-optimized final binary (which is often assumed in interpretations of compiler comparisons).
Unfortunately, binary size is not part of the current compiler benchmark suite. It would have been nice if the suite stored binary sizes for each compilation...
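Until the suite records it, compile time and binary size could be collected outside the suite with something like the sketch below. This is only an assumption about how one might do it by hand: measure_build is a made-up helper name, and the size is simply the size of the output file on disk (unstripped, so symbol tables are included).

```python
import os
import subprocess
import time


def measure_build(argv, output_path):
    """Run one build command and return (compile_seconds, binary_size_bytes).

    argv is the full compiler command line as a list, and output_path is
    the file that command is expected to produce.
    """
    start = time.perf_counter()
    subprocess.run(argv, check=True)   # raises CalledProcessError on failure
    elapsed = time.perf_counter() - start
    return elapsed, os.path.getsize(output_path)


# Example (placeholder file names):
#   secs, size = measure_build(["gcc", "-Os", "bench.c", "-o", "bench-Os"], "bench-Os")
#   print(f"{secs:.2f} s, {size} bytes")
```

Running this once per compiler and optimization level gives the two extra columns (compile time, binary size) that could then be compared against the runtime results.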