Speedtest and Comparsion of Open-Source Cryptography Libraries and Compiler Flags

Abstract

There are many well-known open-source cryptography libraries available, which implement many different ciphers. So which library and which cipher(s) should one use for a new program? This comparison presents a wealth of experimentally determined speed test results to allow an educated answer to this question.

The speed tests encompass eight open-source cryptography libraries of which 15 different ciphers are examined. The performance experiments were run on five different computers which had up to six different Linux distributions installed, leading to ten CPU / distribution combinations tests. Ultimately the cipher code was also compiled using four different C++ compilers with 35 different optimization flag combinations.

Two different test programs were written: the first to verify cipher implementations against each other, the second to perform timed speed tests on the ciphers exported by the different libraries. A cipher speed test run is composed of both encryption and decryption of a buffer. The buffer length is varied from 16 bytes to 1 MB in size.

Many of the observed results are unexpected. Blowfish turned out to be the fastest cipher. But cipher selection cannot be solely based on speed, other parameters like (perceived) strength and age are more important. However raw speed data is important for further discussion.

When regarding the eight selected cryptography libraries, one would expect all libraries to contain approximately the same core cipher implementation, as all calculation results have to be equal. However the libraries' performances varies greatly. OpenSSL and Beecrypt contain implementations with highest optimization levels, but the libraries only implement few ciphers. Tomcrypt, Botan and Crypto++ implement many different ciphers with consistently good performance on all of them. The smaller Nettle library trails somewhat behind, probably due to it's age.

The first real surprise of the speed comparison is the extremely slow test results measured on all ciphers implemented in libmcrypt and libgcrypt. libmcrypt's ciphers show an extremely long start-up overhead, but once it is amortized the cipher's throughput is equal to faster libraries. libgcrypt's results on the other hand are really abysmal and trail far behind all the other libraries. This does not bode well for GnuTLS's SSL performance. And libmcrypt's slow start promises bad performance for thousands of PHP applications encrypting small chunks of user data.

Most of the speed test experiments were run on Gentoo Linux, which compiles all programs from source with user-defined compiler flags. This contrasts to most other Linux distributions which ship pre-compiled binary packages. To verify that previous results stay valid on other distributions the experiments were rerun in chroot-jailed installations. As expected Gentoo Linux showed the highest performance, closely followed by the newer versions of Ubuntu (hardy) and Debian (lenny). The oldest distribution in the test, Debian etch, showed nearly 15% slower speed results than Gentoo.

To make the results transferable onto other computers and CPUs the speed test experiments were run on five different computers, which all had Debian etch installed. No unexpected results were observable: all results show the expected scaling with CPU speed. Most importantly no cache effects or special speed-ups were detectable. Most robust cipher was CAST5 and the one most fragile to CPU architecture was Serpent.

Most interesting for applications outside the scope of cipher algorithms was the compiler and optimization flags comparison. The speed test code and cipher library Crypto++ was compiled with many different compiler / flags combinations. It was even compiled and speed measured on Windows to compare Microsoft's compiler with those available on Linux.

The experimental results showed that Intel's C++ compiler produces by far the most optimized code for all ciphers tested. Second and third place goes to Microsoft Visual C++ 8.0 and gcc 4.1.2, which generate code which is roughly 16.5% and 17.5% slower than that generated by Intel's compiler. gcc's performance is highly dependent on the amount to optimization flags enabled: a simple -O3 is not sufficient to produce well optimized binary code. Relative to gcc 4.1.2 the older compiler version 3.4.6 is about 10% slower on most tests.

All in all the experimental results provide some hard numbers on which to base further discussion. Hopefully some of the libraries' spotlighted deficits can be corrected or at least explained. Lastly the most concrete result: the cipher and library I will use for my planned application is Serpent from the Botan library.