Straight from the lab: why a benchmark isn’t the answer to everything

Dominik Bärlocher

Eva Francis

Benchmarks should be there to give a standardised comparison for smartphones and other technology. But the automated tests forget one rather important thing: the people who’ll be using the devices. Let’s take a look behind the scenes of our test methodology.

Benchmarks promise a lot. Above all, they’re supposed to be a reliable, objective and neutral indication of smartphone performance. This leads some people to the conclusion they’re better than any other test. As a professional phone tester, I can confirm that’s not the case.

It doesn’t often happen that I have two of the same phone on my desk. But I’ve struck lucky with the LG V30. Not only do I have an LG V30+, a Korean export, but also an EU version of the LG V30.
There are only two things that set these devices apart:

The LG V30+ features 128 GB internal memory, while the LG V30 only has 64 GB

The LG V30+ boasts a hybrid dual SIM slot and the LG V30 doesn’t

The rest of the specs are identical. So when I run both through a benchmark app, the scores should be practically the same.

The app I’m using for the benchmark is Antutu Benchmark with the 3D Add On. There are numerous benchmark apps in the Google Play Store; after discussing the options with the mobile geeks in the company, we settled on Antutu, which has consistently good reviews.

This is where we stumbled across the first problem of benchmark testing. There isn’t just the one benchmark, because anyone can develop and publish their own benchmark app. If benchmarks are to be universal, they’ll have to establish some sort of standard.

But at the moment, this standard doesn’t exist. That’s why any result from any benchmark app can be disputed, for the very good reason that another app will spit out another figure, which carries just as much weight in the benchmark world as the Antutu score.

The result: V30+ wins

I ran ten rounds of Antutu benchmarks. The mobile geeks weren’t in agreement on how to do it. Each of them thought they knew how to make a benchmark more accurate and therefore more meaningful. After a run, you’re supposed to leave the phone in the fridge for half an hour so it can cool down. The phone is also meant to be in flight mode so that data transfers don’t interfere with the test.

The LG V30 and the LG V30+ are almost identical

A benchmark that can be influenced by so many environmental factors and that delivers inconsistent data will obviously be doubted. That’s why I decided to carry out the test like this: I’d take both phones and run the benchmark ten times back to back. Without taking a break, without putting them in the fridge and without waiting for the right lunar phase.

Run | LG V30 | LG V30+
----|--------|--------
1 | 162116 | 169016
2 | 165968 | 168973
3 | 158907 | 163637
4 | 160792 | 160500
5 | 156781 | 157918
6 | 147413 | 152253
7 | 148210 | 149940
8 | 142798 | 148834
9 | 173738 | 173223
10 | 165803 | 168960

A quick analysis:

On average, the LG V30 scored 158 252.60 points

On average, the LG V30+ scored 161 325.40 points

The highest single score was 173 738 points, from the LG V30

The lowest single score was 142 798 points, also from the LG V30

This shows that, on average, it was the LG V30+ that won the round. The difference was about 3072.80 points or 1.9%.
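The averages are easy to verify yourself. Here’s a minimal Python sketch, with the score lists copied straight from the table above, that recomputes the means and the gap between the two phones:

```python
# Antutu scores from the ten back-to-back runs (see table above)
v30 = [162116, 165968, 158907, 160792, 156781, 147413, 148210, 142798, 173738, 165803]
v30_plus = [169016, 168973, 163637, 160500, 157918, 152253, 149940, 148834, 173223, 168960]

avg_v30 = sum(v30) / len(v30)                 # 158252.6
avg_v30_plus = sum(v30_plus) / len(v30_plus)  # 161325.4
diff = avg_v30_plus - avg_v30                 # 3072.8 points

print(f"LG V30:  {avg_v30:.2f}")
print(f"LG V30+: {avg_v30_plus:.2f}")
print(f"Gap: {diff:.2f} points ({diff / avg_v30:.1%})")
```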

But something occurred to me as I was carrying out the benchmark tests. Going back to the fridge idea: the purpose is to cool down the phone, the theory being that a cooled phone delivers better and more reliable results. And yet my test contradicts that, at least anecdotally. To be absolutely sure, I’d need to carry out many more tests; as it stands, any claim of representativeness would rest on nothing at all. Tellingly, both phones gave their highest value in the ninth round and their lowest in the eighth.
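That pattern is right there in the numbers: scores broadly drift downward over the first eight back-to-back runs, then jump in round nine. A short sketch, using the same score lists as in the table, finds the best and worst rounds for each phone:

```python
# Antutu scores from the ten back-to-back runs (see table above)
v30 = [162116, 165968, 158907, 160792, 156781, 147413, 148210, 142798, 173738, 165803]
v30_plus = [169016, 168973, 163637, 160500, 157918, 152253, 149940, 148834, 173223, 168960]

for name, scores in [("LG V30", v30), ("LG V30+", v30_plus)]:
    best = scores.index(max(scores)) + 1   # rounds are 1-indexed
    worst = scores.index(min(scores)) + 1
    print(f"{name}: highest in round {best}, lowest in round {worst}")
# Both phones peak in round 9 and bottom out in round 8
```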

What benchmarks tell you

Benchmarks do have a certain significance. Here’s what I found when I compared two completely different phones: an old HTC One M7 from 2013 and a brand-new Razer Phone.

The old HTC One M7 gets soundly defeated. But then again, that’s not really surprising when you check the specs. At best, the benchmark test is a game that confirms a theory you could have formed from the spec sheet; at worst, it’s a waste of time.

What benchmarks don’t tell you

At digitec, when we test phones, we go well beyond benchmark tests. By the end, you know what a phone is like to use day in, day out – not just a jumble of numbers from an app. Because let’s face it, you’re going to be using your phone on a daily basis, and even the best benchmark score hides umpteen factors.

It won’t tell you anything about the bit of dirt behind the glass on my LG V30+, which you wouldn’t notice if you just picked it up once for a benchmark test. The camera speed on the Razer phone wouldn’t have been called into question and the durability of the HTC One M7 wouldn’t have been brought to light.

To uncover these points, to assess them, qualify and quantify them, you need human eyes and hands. At the end of the day, it’s you and not an app who’s going to be holding the phone in your hands and using it to call people, take pictures and send WhatsApps to friends and family. Arbitrary values can be as high as they want, but they’ll never communicate all of this.

And on that note, I’ll carry on testing… just without benchmarks most of the time.