str2long Contest Results Part 1

[NOTE: Reading this post only makes sense if you read and cared about this previous post.]

Ok, evaluating the submissions has been more work than I anticipated, and also things have gotten pretty busy at work, so I’m going to split the evaluation of the submissions into two parts. This first part will discuss objective metrics while the second will be subjective.

I received approximately 78 implementations of str2long(), including multiple submissions from a number of people. I say “approximately” since I deleted some submissions where the sender noticed it was buggy before I did. All submissions can be found on Github here. A total of 35 submissions pass all of my tests, which include many functional tests in addition to integer overflow checks using Clang’s -fsanitize=integer. My tests are reasonably good (I hope) but I still could have missed things, let me know if this is the case. I tested all submissions on an x86-64 Linux machine, targeting both 64-bit and 32-bit modes. The nastiest test case that I used is one that allocates a 9GB string that is mostly leading zeros; this caused a few submissions that passed all other tests to fail, of course because they used a 32-bit index somewhere. One might reasonably argue that this is a vanishingly unlikely use case, but it’s not prohibited by the specification so I tested for it.

Besides correctness, I also looked at code size and performance. Neither were explicit criteria for this contest, but all other things being equal faster code is better. Code size is mainly interesting because it serves as a very crude way to measure algorithmic complexity. In other words, code size answers the question: How much complexity remains after an optimizing compiler has looked at the function?

Since compilers are notoriously finicky about what they will and won’t optimize, I used a variety of compilers and report only the best result. The compilers are:

GCC 4.4.7, 4.5.4, 4.6.3, 4.7.2, and a pre-4.8 snapshot (r196703, compiled from SVN a few days ago)

Clang 3.0, 3.1, 3.2, and a pre-3.3 snapshot (r177225, compiled from SVN a few days ago)

Intel CC 12.0.5 and 13.1.0

Each compiler targeted x86-64 and was invoked at -O0, -O1, -O2, -Os, and -O3. I’m aware there are other flags controlling code generation but I don’t have time to play those games. Code size was measured using the size command after passing the compiler flags -fno-exceptions -fno-unwind-tables -fno-asynchronous-unwind-tables which are necessary to prevent extraneous code from being generated and counted. With these preliminaries aside, here are the size results:

bytes

compiler

submission

96

icc-13.0.1 -O1

libc

109

gcc-snap -Os

pascal

110

gcc-4.6 -Os

peter

110

gcc-snap -Os

ryan

112

icc-13.1.0 -O1

bernd_2

115

gcc-snap -Os

tordek_2

116

gcc-snap -Os

davidl

116

gcc-snap -Os

davidl_2

122

clang-3.0 -Os

yolanpa

129

gcc-4.6 -Os

daniel_2

129

gcc-snap -Os

francois_2

131

gcc-snap -Os

jeffrey

136

gcc-4.4 -Os

yang_2

137

gcc-snap -Os

davide

138

gcc-4.4 -Os

david_2

139

gcc-snap -Os

thomas

145

gcc-snap -Os

matthewf_2

145

clang-snap -Os

chucky_2

148

gcc-4.4 -Os

mikael_2

159

gcc-snap -Os

greg

160

gcc-snap -Os

phil

165

gcc-snap -Os

mattias

172

icc-13.1.0 -O1

stefan

173

gcc-4.6 -Os

matthew_3

175

gcc-4.4 -Os

sidney

176

icc-13.1.0 -O1

patrick

182

gcc-snap -Os

bastian

185

gcc-4.7 -Os

andrew

203

clang-3.2 -Os

till

232

clang-3.1 -Os

olivier

251

gcc-snap -Os

kevin

275

gcc-snap -Os

renaud

276

gcc-4.6 -Os

ben

283

gcc-snap -Os

ethan_2

312

clang-snap -Os

mats

The libc entry doesn’t count, it’s just a call to strtol that has been wrapped with a bit of logic to make it conform to the str2long specification. Perhaps interestingly, no compiler consistently wins. Happily, -Os is the most popular compiler flag.

Performance was measured by taking the best out of 10 runs on an otherwise idle Core i7-2600. The metric is the average number of nanoseconds required to convert a string including whatever overhead is incurred by the test harness. As Robert Graham observed, the test cases are perhaps skewed too heavily towards invalid inputs, which would tend to penalize code optimized for the valid case.

Here are the results:

ns

compiler

submission

21.4

icc-13.1.0 -O2

bernd_2

21.6

icc-13.1.0 -Os

peter

22.4

gcc-4.7 -O2

yang_2

22.5

clang-3.0 -O1

ryan

22.6

gcc-4.7 -O2

till

23.8

icc-13.1.0 -O1

yolanpa

24.7

gcc-4.7 -O2

tordek_2

25.3

clang-3.2 -Os

andrew

25.6

gcc-4.4 -O3

davidl_2

25.7

gcc-4.7 -O2

pascal

25.7

gcc-4.5 -O3

renaud

26.0

gcc-4.7 -O2

davidl

26.2

icc-12.0.5 -O3

stefan

26.8

gcc-snap -O3

bastian

27.1

gcc-4.7 -O2

chucky_2

27.2

clang-3.1 -O3

olivier

27.5

gcc-snap -O3

mikael_2

27.8

gcc-4.7 -O2

kevin

27.9

clang-3.2 -O2

davide

28.3

gcc-snap -O3

david_2

28.5

clang-3.2 -O1

patrick

28.7

gcc-snap -O3

sidney

29.6

gcc-4.7 -O3

phil

32.1

gcc-snap -O3

greg

33.1

clang-3.1 -O2

ben

33.1

icc-12.0.5 -Os

jeffrey

33.8

icc-13.1.0 -O2

mattias

34.5

gcc-snap -O2

thomas

34.6

clang-snap -O3

matthewf_2

37.5

gcc-4.7 -O3

matthew_3

49.0

clang-3.1 -O1

francois_2

52.8

icc-12.0.5 -O2

libc

96.4

gcc-4.7 -Os

daniel_2

103.3

gcc-4.4 -O3

mats

128.3

gcc-4.4 -O2

ethan_2

Again, no particular compiler wins, and this time no particular optimization flag consistently wins either. The strtol code from libc performs surprisingly well considering that it is implementing a slightly more generic specification than str2long.

These results would perhaps be improved by repeating them on a modern ARM platform and some good ARM compilers, but I don’t have those available right now.

This concludes the objective portion of the contest results. The followup will be subjective and will look at the strategies and idioms that people used to implement the str2long specification without overflowing integers.

21 thoughts on “str2long Contest Results Part 1”

It might be interesting to see performance numbers for the smallest emitted versions of each submission, and the code sizes of the fastest optimizations of each submission. The whole thing could be displayed on a single size-speed scatter-plot with the two sets of points.

“The nastiest test case that I used is one that allocates a 9GB string that is mostly leading zeros; this caused a few submissions that passed all other tests to fail, of course because they used a 32-bit index somewhere.”

I’m guessing that was me, but I see robert_2.c in your results table (no toby.c) which I assumed would have also suffered from this — he uses an ‘int’ index, I use an ‘unsigned int’ one. Am I missing something?

I noticed that. 64-bit constants also eat space. This version (str2long*3*, also linked in my previous post) is at 84 bytes on gcc-4.7 -Os; when I write the conservative overflow test less efficiently but without the constant, that drops to 78.