Java Performance: Efficiently Formatting Doubles

In my book Java
Performance Tuning, I considered efficient ways to convert numbers
to strings, and I provided conversion algorithms that are faster than
those released with the Java SDK (up to and including version 1.3). In
this article, I'll consider how to format doubles using a
variation of that conversion algorithm, and show that the conversion
of numbers with formatting can be faster than the original unformatted
conversion.

The SDK conversion methodology

First, let's look at how the SDK converts doubles to
strings and see why that procedure is slow. Calling
Double.toString(double) to convert a double
value to a string immediately creates an instance of
java.lang.FloatingDecimal, a class that is private to the
java.lang package. FloatingDecimal is the
internal SDK class that provides all the logic necessary to handle
floating point numbers. If you look at the source code for this class,
you will see that converting floating point numbers to strings appears
to be horrendously complicated. The algorithm needs to deal with
several special cases: infinity, -infinity, not-a-number, -0.0, as
well as digits before and after the decimal point, exponential values,
and rounding accuracy. In addition, the SDK conversion algorithm works
out the minumum number of digits necessary to identify uniquely the
underlying bit-storage value.

This last factor is a little subtle: I'll illustrate it with an
example. The number 5.39E-322 is relatively close to the smallest
IEEE-754 double value. The underlying double
representation cannot actually hold the full accuracy of a number that
small (note that this representation is not Java-specific: all
IEEE-754 floating point numbers in all languages have the same
limitations). Instead the double value is held as
5.4E-322. The number 5.41E-322 is also stored to the same accuracy,
resulting in the same number, i.e. the code

In fact, the underlying bit-stored number is closer to
5.396039603960396E-322. However, the SDK algorithm correctly
identifies that all these numbers are stored as the same underlying
value, and that there is no other close double value that
can be stored which requires more than one digit of accuracy to be
shown, i.e. 5.4E-322 uniquely identifies the underlying
double value, while using the minimum number of
digits. (The next nearest double value is 5.4455E-322
which is represented by 5.43E-322 using the SDK conversion algorithm,
and this double would be assigned for all values between
5.42E-322 and 5.45E-322.)

A faster conversion methodology

As you can see, converting doubles to strings is a
nuanced affair requiring lots of work. That explains why the SDK is so
slow to convert them. In my book, I presented an alternative algorithm
to convert doubles to strings. The algorithm works by
determining the magnitude of the double (there is a rapid
algorithm to do this using some of the bits of the
double), then scaling the double value into
a long and converting the long value using
another efficient algorithm. Essentially, the algorithm looks similar
to

The long-to-string conversion algorithm I use is also
faster than the SDK version, as I use an algorithm which successively
strips off the highest digit, thus avoiding the use of an intermediate
char array or StringBuffer to hold a
partially built string. (The documented code for all algorithms is
included in the source code for this article.) The nth digit
of a particular long can be obtained using a relatively
simple formula

By successively stripping off the highest digit, the numbers are
printed or appended in order. This means that there is no need for any
intermediate temporary structures when printing numbers using my
method. This can be a great help when printing large amounts of data
to streams. Avoiding the temporary objects that are generated during
the normal SDK number-to-string conversions both speeds up the
conversion and avoids garbage collection further down the line.

In my original double-to-string conversion algorithm,
I didn't bother with printing the shortest unique representation of
the double. In fact the algorithm always prints out all
the digits available, up to about 15 decimal places. During the
conversion, however, rounding errors mean the last few decimal places
are not necessarily correct. Therefore, if you need full accuracy to
the 15th decimal place, then you should not be using my conversion
algorithm to print double values. I'd also argue that you
shouldn't be using doubles at all if that is your
requirement: instead math manipulation packages and
BigDecimal are probably more appropriate for your
problem.