"I believe one of the reasons the benchmark numbers are totally bogus is that the compilation is done on ARM hosts.
Given the benchmarks are apparently compiled without -mcpu=cortex-a9, I suspect LLVM ended up generating code for a "generic" ARMv4 CPU.
This article makes me sick to my stomach. Thanks, Evan"

"Michael Larabel on June 11, 2012: The benchmarking was still being done from a PandaBoard ES development board with a Texas Instruments OMAP4460 dual-core ARM Cortex-A9. Via the CFLAGS/CXXFLAGS, -march=armv7-a was passed to each compiler."

On the other hand, once you sort out your flags war and reach consensus, it might be interesting to see this test run on a Calxeda quad-core ARM Cortex-A9 processor, which is optimized for use in servers over a 10 Gigabit/s internal fabric on each card: a sample box with 2 or more cards installed, for 32 Cortex-A9 cores/8 SoCs and greater. You really should go and get the latest Linaro GCC too.

armv7 is what e.g. Ubuntu will target in their upcoming ARM releases, so it seems very relevant how that performs. Compiling all software with hardware-specific CFLAGS is typically only done by Gentoo or other source-based distros.

Again, what is the point of running the 7-zip benchmark with no -On optimization setting? This means that at least GCC will default to -O0, which is no optimization. Just add -O2, or preferably -O3, so that this benchmark ends up being in any way relevant. NO ONE will use 7-zip compiled with no optimizations. You are benchmarking compiler optimization here, so what possible point is there in NOT enabling optimizations?

Yeah, while the Phoronix Test Suite framework itself is fine, the choice of benchmarks is questionable at best.

Let's have a look at the "popular" C-Ray 1.1 benchmark. It can be downloaded from http://www.phoronix-test-suite.com/b...ray-1.1.tar.gz
It is typically run as "./c-ray-mt -t 32 -s 1600x1200 -r 8 -i sphfract -o output.ppm", but changing 1600x1200 to 160x120 lets it run for seconds instead of hundreds of seconds on ARM. Profiling of gcc-4.7.0 compiled code shows the following:

But the real fix is to use "static inline" for the performance-critical functions. Whoever developed this C-Ray application apparently has no clue about performance optimizations. Or maybe it was done on purpose to make the job harder for the compilers. Compilers that are configured to use aggressive inlining by default are going to win by a huge margin on this test (trading it for larger binary sizes, because there are no free cookies).

Generally, I get the impression that this selection of Phoronix benchmarks has been made on purpose. Surely, with compiler optimizations disabled, or when benchmarking poorly written code such as C-Ray, the difference between the results from different compilers may be quite significant (and mostly random). Benchmarking properly written code with properly selected optimization options is surely boring, because it is less likely to show surprising wins or sensations.

Depending on how GCC was configured (you can see by passing -v), this might be a non-issue, but passing only -march=armv7-a without other -mtune= or -mcpu= options might have resulted in GCC tuning for the Cortex-A8.
You might want to re-check to be sure...

Tuning for Cortex-A8 works well for Cortex-A9 too. They are reasonably similar, and scheduling instructions for an in-order dual-issue processor does not usually do any harm for its out-of-order dual-issue twin. Moreover, there are cases when -mcpu=cortex-a9 is bad for performance: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53659 (just filed this enhancement request)