Benchmarks for Whole Phones Needed

MADISON, Wis. — Benchmarking is a tricky business. There are benchmarks and there are benchmarks. I'm sure I'm not alone in my weariness with reading reports on yet another batch of benchmark results.

Among the more notorious examples is the recent news about Intel's Z2580 application processor, codenamed CloverTrail prior to launch, which outperformed competitors' processors in a benchmarking exercise. The report, issued by ABI Research in early June, concluded that Intel has succeeded in significantly reducing the power consumption of its smartphone application processor, and that the chip now rivals equivalent processors based on the ARM architecture.

Subsequent reporting and investigation, however, revealed that ABI's conclusions were derived from one outlying benchmark (done by AnTuTu). The market research firm neglected to compare results from a suite of benchmarks.

To be clear, in the electronics industry, there's no shortage of benchmarks. Benchmarking exercises are carried out by various outfits for just about every kind of component, from CPUs, GPUs, and DSPs to FPGAs and more.

"There are surprisingly very few benchmarks available that capture the performance of the whole phone," said Jeff Bier, president of Berkeley Design Technology, Inc. (BDTI). Dissecting the performance of technology on a component level is one thing. "But what about measuring technology in terms of what matters to consumers, such as battery life, application speed, and network performance for a variety of mobile phones and tablets?" Bier asked.

BDTI last week announced that it has teamed with Qualcomm to create a new user-experience rating for mobile devices such as smartphones and tablets.

For BDTI, a technology consulting firm known for developing its own benchmarks for processor cores and other technologies over the last 20 years, this ascent up the food chain to focus on system-level performance struck me as a radical departure. Bier, however, stressed that BDTI is uniquely qualified to meet the challenge precisely because of its decades of expertise in independent benchmarking, which has earned widespread trust in the technology industry. "We know what we're doing," said Bier.

Pitfalls
Of course, I couldn't help but point out that BDTI is taking money from Qualcomm to develop this new benchmarking for the consumer experience of smartphones and tablets. How could it be "independent"?

Bier said, "Of course, no model is perfect." The new efforts are being funded by Qualcomm, but BDTI is independently designing benchmarks and experience ratings based on technical merit, he asserted.

"How we actually designed benchmarks will be transparent, and it can be viewed by network operators, OEMs, and savvy technology analysts" for their inspection, Bier said. "In the interest of checks and balances, our policy is to let others see how we've done it," he added.

For those with legitimate business interests, the source code of BDTI's benchmarking will be available for a fee, said Bier. Journalists and industry analysts can see the results of the new benchmarks for free.

Bier also noted that BDTI is fully aware of many pitfalls and challenges associated with benchmarking efforts.

History suggests that all benchmarks are subject to manipulation. Even if they're developed within an industry consortium, there can be members who know how to skew a benchmark so that it stacks the deck against competing technologies, who pressure others, and who get votes from their friends.

Looks like the industry loves benchmarking. I didn't know the hangover from Dhrystone & Whetstone would last this long! I find this whole exercise amusing and largely put it down to the bright marketing folks trying to find that 'differentiation' now that the mobile computing space is getting more competitive.

I think one point lost in this 'system-level' benchmarking is the definition of the 'system' itself! Smart phones are much more than yesterday's pure computing devices. A phone with the best hardware benchmark can end up losing when it lacks the best operating system the user interacts with.

Ideally, I want something that is effectively open source: I want to be able to reproduce the test environment, run the same tests on the same gear, and get the same test results as the people posting their benchmarks. If I don't, something is off somewhere.

DMcCunney, I agree. As long as BDTI makes its benchmarking methodology transparent, smart people in the industry should be able to spot any anomalies, convenient omissions, or modifications made to the testing itself.

I like #2, too. But the problem is that BDTI is attempting to develop hardware benchmarks, and the number of strokes/touches needed to do something is a UI matter, and will be dependent on the OS, the UI it presents to the user, and the apps the user runs. It will have only an indirect relation to hardware, and a well-designed phone might handily beat the competition on that measurement while having the least powerful hardware.

Structured in a manner that will emphasize the strengths of their offerings?

It reminds me of the comment "The nice thing about standards is there's so many of them!"

I don't have a problem with BDTI being funded by Qualcomm to do this, as long as they are open about their methodology, what they are measuring, and what conclusions might be drawn from it. Someone has to pay the costs of developing such a thing.

Once they have something they think works, the next step is to pitch it to the appropriate industry consortium as the standard way you measure what the benchmarks track. Then you make popcorn and sit back and watch the fun.