Perl’s rounding problem

Perl’s approved way of rounding numbers goes through (s)printf, but there’s a problem. In short, it does the wrong thing.

Most people were taught the rule that 1, 2, 3, 4 round down to 0, and that 5, 6, 7, 8, and 9 round up to the next 0. That means that more digits round up than round down, introducing a systematic bias into any computations where you might round. You shouldn’t have to watch Superman III to realize the disastrous global consequences this has.

There’s more than one way to round a number. Most methods aim for the nearest number, but if you are exactly halfway between two candidates, there are options. There are more than two ways. There are more than three. There are, well, a lot of ways:

Round half up

Round half down

Round half toward zero

Round half away from zero

Round half to even

Round half to odd

Round half alternately up and down

Round half stochastically

If your perl is built against the GNU C library (or something that behaves like it), printf rounds half to even by default. Perl relies on this behavior.
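A minimal sketch of that default, assuming a perl whose sprintf rounds exact halves to even. The inputs are chosen because they are exactly representable in binary floating point, so each one really does sit halfway between two integers:

```perl
use strict;
use warnings;

# Exact halves: each value is precisely between two integers.
my @halves = (0.5, 1.5, 2.5, 3.5);

# "%.0f" asks sprintf for zero digits after the decimal point.
my @rounded = map { sprintf '%.0f', $_ } @halves;

print "$halves[$_] -> $rounded[$_]\n" for 0 .. $#halves;
```

Under round half to even, the results are 0, 2, 2, and 4: every tie lands on the even neighbor.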

Every time you try this you get the same answer (so, no stochastic or alternate rounding). The C library also offers floor, ceil, and trunc, but those have similar problems.

As you are rounding, you are going to get more even numbers than odd numbers. If you are writing banking software, asymmetric currency rounding could destabilize a currency. The Risks Digest has several entries for security problems in rounding. These issues are much worse than some lame “attack” on CGI.pm because a programmer can’t read.

The modulus of negative numbers

Among the heated technical debates, such as vi or emacs, tabs or spaces, or Star Wars or Star Trek (the first answer in each is the right one), the ones that matter, such as the correct value of modulo addition with negative numbers, are overlooked.

Binary “%” is the modulo operator, which computes the division remainder of its first argument with respect to its second argument. Given integer operands $m and $n : If $n is positive, then $m % $n is $m minus the largest multiple of $n less than or equal to $m. If $n is negative, then $m % $n is $m minus the smallest multiple of $n that is not less than $m (that is, the result will be less than or equal to zero).

– The perldoc documentation for the % operators

The modulo operator takes two numbers and does something to them. For $m % $n, you have:
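A minimal sketch of the four sign combinations, assuming the magnitudes 137 and 12 and no `use integer` in scope (that pragma hands the operation to the C compiler and can change the answers):

```perl
use strict;
use warnings;

# The same two magnitudes with every combination of signs.
my @results = (
     137 %  12,    #  5
     137 % -12,    # -7
    -137 %  12,    #  7
    -137 % -12,    # -5
);

print "$_\n" for @results;
```

The sign of the result follows the right operand, as the quoted documentation describes.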

That unary minus operator is two precedence levels above the modulo operator. That Perl makes one operator better than another is a whole other issue, but that’s the way it is and we can’t fix it now. Try it again. Use the parentheses (a feature Perl stole from LISP, which had some extra to spare) to separate the operators:
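A minimal sketch of the two possible groupings, with variable names invented for illustration. Because unary minus binds tighter than `%`, the version without parentheses matches the negate-first grouping:

```perl
use strict;
use warnings;

# Explicit grouping: negate first, then take the modulus.
my $negate_first = (-137) % 12;

# The other grouping: take the modulus first, then negate.
my $modulus_first = -(137 % 12);

# No parentheses: unary minus still applies before %.
my $bare = -137 % 12;

print "(-137) % 12 = $negate_first\n";     # 7
print "-(137 % 12) = $modulus_first\n";    # -5
print "-137 % 12   = $bare\n";             # 7
```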

But it’s even worse, because those numbers aren’t what the documentation says they should be. “If $n is positive, then $m % $n is $m minus the largest multiple of $n less than or equal to $m”. Let’s take the case of -137 and 12. There are a couple of ways to look at this. If a “multiple” we call $i must be positive, there is no value such that $n * $i will be less than or equal to any negative value. If that $i can be negative, the word “largest” is a bit troublesome. Wikipedia says large numbers are positive.
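One way to test the second reading, where the multiplier $i may be negative, is to apply the documented rule by hand and compare it with the operator. This is a sketch: `documented_mod` is a hypothetical helper, not anything Perl provides, and it assumes small integer operands so that the floating-point division is harmless.

```perl
use strict;
use warnings;
use POSIX qw(floor);

# Hypothetical helper: follow the documented rule literally,
# letting the multiplier $i be negative.
sub documented_mod {
    my ($m, $n) = @_;
    my $i = floor($m / $n);    # integer multiplier, possibly negative
    return $m - $n * $i;       # $m minus that multiple of $n
}

for my $pair ([137, 12], [137, -12], [-137, 12], [-137, -12]) {
    my ($m, $n) = @$pair;
    printf "%4d %% %3d: operator %2d, by hand %2d\n",
        $m, $n, $m % $n, documented_mod($m, $n);
}
```

For -137 and 12, the multiplier is -12, the multiple is -144, and -137 minus -144 is 7, which is what the operator returns.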

Fake random numbers

Perl has a rand function. It claims to return “a random fractional number greater than or equal to 0”, but it doesn’t. It’s not random. It’s fake random in a way that might work if you only want to use one of them to complete a homework assignment in a beginning programming course in middle school. Although the documentation includes a footnote saying “You should not rely on it in security-sensitive situations”, it does not say “Don’t ever use this.” like it should. Try this program:
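A minimal sketch of such a program, assuming a fixed seed passed to srand (137 is an arbitrary choice, and `draw` is a hypothetical helper):

```perl
use strict;
use warnings;

# Hypothetical helper: seed the generator, then draw $count numbers.
sub draw {
    my ($seed, $count) = @_;
    srand($seed);
    return map { rand() } 1 .. $count;
}

# Two runs with the same seed.
my @first  = draw(137, 5);
my @second = draw(137, 5);

print "$first[$_]  $second[$_]\n" for 0 .. $#first;
print "identical sequences\n" if "@first" eq "@second";
```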

Not only do you get the same numbers, but you get them in the same order. Perl tries to hide this from you by automatically calling srand and giving it a “random” number to start the completely repeatable sequence.

There are a variety of ways this can screw up if you use it in an application that keeps going and going. Eventually you come back to the beginning of the sequence, perhaps colliding with existing customer data.

Perl lets anyone program

Perhaps the biggest problem with Perl is that anyone with a text editor can write a program and upload it to the internet. It’s a feature that Perl allows someone to get their work done, but the problem shows up when someone tries to generalize that to other people’s work. Projects such as Not Matt’s Scripts try to mitigate this by fixing the problem one script at a time. There are simply too many scripts to get to in the lifetime of the Universe.

In conclusion

If you’ve made it this far without complaining on Twitter, Reddit, or Hacker News, congratulations. You know what day of the year it is.

These are real issues, though, and if your application is sensitive to small differences in numbers (such as calculating fundamental constants of the universe or pension fund allocations), you’re probably using sophisticated number libraries and have various levels of audits to verify the results.
