derekn has asked for the
wisdom of the Perl Monks concerning the following question:

I am trying to calculate percentages. For example, a user has 5 choices, and each choice is displayed as a percentage of the total votes. The problem is that the percentages do not come out as neat whole numbers (e.g., 92.84513%). When I round them to whole numbers (93%), the results sometimes don't add up to 100 as they should, which makes the displayed percentages look inaccurate. Sometimes the total is 99, sometimes 101, and so on. I have used $percent=sprintf("%.0f", $value) to round each value, but no luck. Any ideas how to do this so that they add up to 100%?
Derek

What you're asking is not possible. Any time you round a number you introduce error; how much depends on how coarsely you round. Add enough of those errors together and your total can end up off from the "expected" total (in this case 100%).

The only way around this is to round the individual entries to whole numbers for display but, when calculating the total, add the unrounded entries rather than the rounded ones, and then round that result for display if you want.
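A minimal sketch of that approach, assuming made-up vote counts; the name percent_report is hypothetical and not from the thread. It rounds each entry for display but builds the displayed total from the unrounded percentages, and it also returns the naive sum of the rounded entries for comparison.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the rounded percentages (array ref),
# the total computed from the UNrounded percentages (then rounded),
# and the naive sum of the rounded entries.
sub percent_report {
    my @votes = @_;
    my $total = 0;
    $total += $_ for @votes;
    die "no votes\n" unless $total > 0;

    my @exact   = map { 100 * $_ / $total } @votes;
    my @rounded = map { sprintf("%.0f", $_) } @exact;

    my ($exact_sum, $rounded_sum) = (0, 0);
    $exact_sum   += $_ for @exact;
    $rounded_sum += $_ for @rounded;

    return (\@rounded, sprintf("%.0f", $exact_sum), $rounded_sum);
}

# Made-up data: five choices with 3, 3, 3, 3 and 2 votes.
my ($shown, $total_shown, $naive_sum) = percent_report(3, 3, 3, 3, 2);
print join(", ", map { "$_%" } @$shown), "\n";
print "Total: $total_shown% (the rounded entries alone sum to $naive_sum%)\n";
```

With these counts the individual entries are displayed as 21% and 14%, which sum to 98, while the total line still reads 100%.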

If you add the UNrounded percentages, they should total 100% (apart from the tiny error you'll sometimes get from value/count pairs, such as 100/6, that cannot be represented exactly at whatever decimal length you use).

But for cases like yours, as I understand it, a standard and commonly accepted practice is to include the disclaimer "Totals may not equal 100% because of rounding."

Update: For clarity (in light of OP's next reply), "numbers" was changed to "percentages" above.

ysth's node above has a link to a nice essay on fudging numbers so that they add up to 100. Apparently the author's company fudges the numbers so that the help desk isn't inundated with complaints about "mistakes" in the reports they publish. So there may be some situations where, reality aside, one really does need to make those numbers add up to 100!

The question then becomes how to do this while minimizing mistaken impressions. The choice depends a great deal on how one expects people to read the numbers. If one thinks that readers are making judgements based on absolute percentages, then one will want to add the fudge factor to the largest numbers: adding 1 to 1% doubles it, whereas adding 1 to 98% is rather insignificant.

However, percentages are relative measures by nature. Thus one might also assume that readers are making judgements based on relative percentages more than absolute percentages. In that case, one might argue that fudge factors should be assigned randomly to the percentages to avoid bias. I don't know which is best. I found several articles on subjective perceptions of statistics via Google, but most of them were in paid collections and would have required a trip to the university library. Unfortunately, I didn't have the time to look them up.

The article ysth linked to also had a nice sample of test data, so I decided to work up the case of random assignment of fudge factors along with a test suite based on Test::More.

The test suite is wrapped in a subroutine, runTests, to make it easier to test alternative algorithms. If you would like to try your own algorithm against the test suite, pass a code reference. Alternate fudging routines should accept two parameters: ($precision, $aHistogram). $precision is the number of decimal digits in your total. For example, if $precision == 2 then your percentages must add up to 100.00. $aHistogram is a histogram whose numbers can add up to anything. The fudging subroutine is responsible for converting them to percentages.
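As an illustration of that interface only, here is a minimal sketch of a fudging routine that takes ($precision, $aHistogram). The name fudge_randomly and the details are assumptions, not the code from this post: it truncates each percentage at the requested precision and then hands the missing units to randomly chosen entries that were actually truncated, so no value moves by more than one unit in the last digit.

```perl
use strict;
use warnings;
use List::Util qw(shuffle sum);

# Hypothetical fudging routine with the ($precision, $aHistogram) interface.
# Returns an array ref of percentage strings that add up to exactly 100.
sub fudge_randomly {
    my ($precision, $aHistogram) = @_;
    my $total = sum(@$aHistogram) or die "histogram has no counts\n";

    my $scale  = 10 ** $precision;   # units per percentage point
    my $target = 100 * $scale;       # 100% expressed in units

    # Exact share of each bin in units, then truncated to whole units.
    # The small epsilon guards against values like 24.999999999.
    my @exact = map { $_ / $total * $target } @$aHistogram;
    my @units = map { int($_ + 1e-9) } @exact;

    # Units still missing after truncation.
    my $shortfall = $target - sum(@units);

    # Only bins that lost a fraction are candidates; pick them at random.
    my @candidates = shuffle grep { $exact[$_] - $units[$_] > 1e-9 } 0 .. $#units;
    $units[$_]++ for @candidates[0 .. $shortfall - 1];

    return [ map { sprintf("%.${precision}f", $_ / $scale) } @units ];
}

# Made-up histogram; which entries receive the extra unit varies per run.
print join(", ", @{ fudge_randomly(0, [3, 3, 3, 3, 2]) }), "\n";
print join(", ", @{ fudge_randomly(2, [1, 1, 1]) }), "\n";
```

Because the assignment is random, repeated runs can show different entries bumped up, but the total is the same every time.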

You can make the quantized percentages add up to 100, but doing so increases the individual quantization errors compared with plain rounding. It minimizes the aggregate error rather than the individual errors. While others have advocated minimizing the individual errors, there may be cases where minimizing the aggregate error is preferable.

The following example demonstrates one way the aggregate error can be minimized. The implementation is crude, not well tested and replete with print statements which may help you follow what it is doing.

Let's say you've got 5 percentages. Sort them from high to low, then make the smallest one 100-(sum of the 4 bigger ones). This forces them to add up the way you want, but it pushes all the round-off error into the smallest percentage. Another way is to make the largest value 100-(sum of the 4 smaller ones). This pushes the error into the largest value. Neither way is "correct" in a strict mathematical sense, but I'm assuming that's not much of a priority for you anyway.
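Both variants can be sketched roughly as follows. The name force_to_100 and the vote counts are made up for illustration. Instead of sorting, the sketch just locates the smallest (or largest) entry, which keeps the original order for display; on ties it picks the first such entry.

```perl
use strict;
use warnings;

# Hypothetical helper: round every percentage, then overwrite one entry
# with 100 minus the sum of the others. $absorb is 'smallest' or 'largest'
# and selects which entry soaks up the round-off error.
sub force_to_100 {
    my ($absorb, @votes) = @_;
    my $total = 0;
    $total += $_ for @votes;
    die "no votes\n" unless $total > 0;

    my @pct = map { sprintf("%.0f", 100 * $_ / $total) } @votes;

    # Find the index of the entry that will absorb the error.
    my $idx = 0;
    for my $i (1 .. $#votes) {
        my $better = $absorb eq 'largest'
            ? $votes[$i] > $votes[$idx]
            : $votes[$i] < $votes[$idx];
        $idx = $i if $better;
    }

    # Replace that entry with whatever is left over from 100.
    my $others = 0;
    $others += $pct[$_] for grep { $_ != $idx } 0 .. $#pct;
    $pct[$idx] = 100 - $others;

    return @pct;
}

# Made-up data: plain rounding would give 21, 21, 21, 21, 14 (sum 98).
print join(", ", force_to_100('smallest', 3, 3, 3, 3, 2)), "\n";  # 21, 21, 21, 21, 16
print join(", ", force_to_100('largest',  3, 3, 3, 3, 2)), "\n";  # 23, 21, 21, 21, 14
```

In this example the smallest entry jumps from 14 to 16, a visibly larger relative change than moving one of the 21s to 23, which is the trade-off between the two variants.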