gr0k has asked for the
wisdom of the Perl Monks concerning the following question:

I'm working on an accounting problem where we have a list of checks and an amount. We need to find all possible combinations of checks that add together to make that amount. Basically if we had a record set like:

A - 10
B - 5
C - 13
D - 3
E - 15
F - 1

And we're searching for amount = 16. That would return 3 possible matches:

A + B + F
C + D
E + F

I came up with a way to do it using the Algorithm::Loops module: calculate all possible permutations down to a certain depth and ignore duplicates (such as ABF, AFB, BFA, etc., which are all the same thing). The problem is, we need to be able to search through potentially billions of combinations, which takes a looooong time. I'm hoping to get some help optimizing my code to make it run faster. In tests I've run so far, this code can go through about 5 million combinations in about 3 minutes on a 2.4 GHz machine.
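The original code isn't quoted in this thread, but the core idea of generating combinations directly (so that ABF, AFB, BFA are produced only once) can be sketched like this; the names and structure are my own illustration, not the poster's code:

```perl
use strict;
use warnings;

# Sketch: enumerate combinations (not permutations) by only ever
# extending a partial set with checks that come *after* the last one
# chosen.  Pruning when the running sum exceeds the target assumes
# all amounts are positive.
my %checks = ( A => 10, B => 5, C => 13, D => 3, E => 15, F => 1 );
my $target = 16;

sub find_sums {
    my ( $checks, $target ) = @_;
    my @ids = sort keys %$checks;
    my @matches;
    my $recurse;
    $recurse = sub {
        my ( $start, $sum, $chosen ) = @_;
        if ( $sum == $target ) {
            push @matches, join ' + ', @$chosen;
            return;
        }
        return if $sum > $target;    # prune: amounts are positive
        for my $i ( $start .. $#ids ) {
            $recurse->( $i + 1, $sum + $checks->{ $ids[$i] },
                [ @$chosen, $ids[$i] ] );
        }
    };
    $recurse->( 0, 0, [] );
    return @matches;
}

my @matches = find_sums( \%checks, $target );
print "$_\n" for @matches;    # A + B + F, C + D, E + F
```

Generating combinations instead of permutations means there is no duplicate-filtering step at all: each subset is visited exactly once.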

It will take me a while to figure out what your code is doing, but do you mean this method will only work with up to 32 checks? My record set could have hundreds of records in it; my example just had 6.

The limitation is just the typical number of bits in an integer on 32-bit machines.
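To illustrate where that limit comes from (this is my own sketch of the bit-vector idea being discussed, not the code from the thread): each integer encodes one subset, with bit $i meaning "check $i is included", so a native 32-bit integer can only index subsets of up to 32 checks.

```perl
use strict;
use warnings;

# Sketch of the bit-vector idea: every integer from 1 to 2**N - 1
# encodes one non-empty subset of the checks.  With plain 32-bit
# integers the mask overflows past N = 32 checks.
my @amounts = ( 10, 5, 13, 3, 15, 1 );    # A .. F
my $target  = 16;

sub count_subset_sums {
    my ( $amounts, $target ) = @_;
    my $n     = @$amounts;
    my $found = 0;
    for my $mask ( 1 .. ( 1 << $n ) - 1 ) {
        my $sum = 0;
        for my $i ( 0 .. $n - 1 ) {
            $sum += $amounts->[$i] if $mask & ( 1 << $i );
        }
        $found++ if $sum == $target;
    }
    return $found;
}

print count_subset_sums( \@amounts, $target ), "\n";    # 3
```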

Let me reiterate Abigail's sound advice, however. Say you have 100 checks. Then any exhaustive search will take 2**100 tries. That is around 10**30, so at 1 msec per try the program would take 10**27 seconds. The age of the universe is only around 5 * 10**17 seconds, so exhaustive search is hopeless.

Even a more clever branch-and-bound algorithm can only potentially reduce that base of 2 a little.

So I think you need to rethink your task. Do you really need exact amounts, or can you approximate? If all the checks were even but the desired amount was odd, there would be no solution; would your business collapse at that point? Look at what you can do to relax the constraints. There are fine polynomial heuristics for the knapsack problem that get you close in modest time.
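As one concrete example of relaxing the problem (my own illustration, not from this thread): if the amounts are whole cents, the classic pseudo-polynomial subset-sum dynamic program runs in O(n * target) time instead of O(2**n):

```perl
use strict;
use warnings;

# Sketch: pseudo-polynomial subset-sum DP.  $ways[$s] counts the
# subsets of the checks seen so far that sum to exactly $s; amounts
# are assumed to be non-negative integers (e.g. whole cents).
sub count_exact_sums {
    my ( $amounts, $target ) = @_;
    my @ways = (0) x ( $target + 1 );
    $ways[0] = 1;    # one way to make 0: the empty set
    for my $amt (@$amounts) {
        # iterate downward so each check is used at most once
        for ( my $s = $target ; $s >= $amt ; $s-- ) {
            $ways[$s] += $ways[ $s - $amt ];
        }
    }
    return $ways[$target];
}

print count_exact_sums( [ 10, 5, 13, 3, 15, 1 ], 16 ), "\n";    # 3
```

This only counts solutions; recovering the actual subsets needs back-tracking through the table. But it shows why restating the problem (integer cents, bounded target) can turn a hopeless 2**n search into something tractable.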

This does appear to be a bit faster and looks like it will work for what we need it to do. I'll have to do a bit more testing with our data. One question, though: I notice it produces a lot more combinations. Any idea how I could calculate the number of combinations from the number of records searched? That way I could output some status information to let them know how far into the search they are. Thanks!

gr0k,
Well, it depends on what answer you want. To be honest, you were checking the same number of combinations as me. The difference is that you weren't counting the ones you discarded. Effort was still spent on determining that they could be discarded, though - which is why I included all combinations.

Making my $count look like your count would require more work - and make the process slower. If you want, you can move $count++ to after the for loop. Then you would only be counting combinations that contained no duplicate number and whose combined value was not greater than the target sum.
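For the progress-display question above: the total number of candidate combinations of $n records up to a given depth is just a sum of binomial coefficients, which you can compute up front as the denominator of a "searched X of Y" readout. A sketch (function name and interface are mine):

```perl
use strict;
use warnings;

# Sketch: total number of k-element combinations of $n records,
# summed over depths 1 .. $max_depth.  Builds each C($n, $k)
# incrementally from C($n, $k - 1) to avoid big factorials.
sub total_combinations {
    my ( $n, $max_depth ) = @_;
    my $total = 0;
    my $c     = 1;    # C($n, 0)
    for my $k ( 1 .. $max_depth ) {
        $c = $c * ( $n - $k + 1 ) / $k;    # C($n, $k)
        $total += $c;
    }
    return $total;
}

print total_combinations( 6, 6 ), "\n";    # 63, i.e. 2**6 - 1
```

For large $n this number itself gets astronomically big, which is the point of the exhaustive-search warning earlier in the thread.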

In the real problem, I would recommend removing up front any candidates that are already larger than the target amount.

Upon closer inspection, your code has a bug: it would eliminate two checks with the same amount, since you're checking an array of amounts. :( I'll have to modify it to allow that and run some tests to see if it still results in a speed increase...
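One way to fix that class of bug is to identify each check by its index (or check ID) rather than by its amount, so two checks for the same amount stay distinct. A sketch of that idea (names assumed, not the thread's actual code):

```perl
use strict;
use warnings;

# Sketch of the duplicate-amount fix: track checks by index, not by
# amount, so two $50 checks are separate candidates.
my @amounts = ( 50, 50, 20 );    # two checks with the same amount
my $target  = 100;

sub find_index_sets {
    my ( $amounts, $target ) = @_;
    my @matches;
    my $recurse;
    $recurse = sub {
        my ( $start, $sum, $chosen ) = @_;
        if ( $sum == $target ) {
            push @matches, [@$chosen];
            return;
        }
        return if $sum > $target;
        for my $i ( $start .. $#$amounts ) {
            $recurse->( $i + 1, $sum + $amounts->[$i], [ @$chosen, $i ] );
        }
    };
    $recurse->( 0, 0, [] );
    return @matches;
}

# Indices 0 and 1 both hold 50, and both can appear in one solution.
my @found = find_index_sets( \@amounts, $target );
print join( ',', @$_ ), "\n" for @found;    # 0,1
```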