Today I want to talk about a problem I’ve never covered, which is how to generate all the combinations of a sequence. The combinations are all the ways to take some given number of elements from the sequence or, equivalently, all the subsequences of a given length. (A subsequence of a sequence has elements from the sequence in the same order, but some elements from the original sequence can be missing.) For example, if we have the sequence {50, 60, 70, 80, 90} and we wish to choose three, then the combinations are:

{50, 60, 70}
{50, 60, 80}
{50, 60, 90}
{50, 70, 80}
{50, 70, 90}
{50, 80, 90}
{60, 70, 80}
{60, 70, 90}
{60, 80, 90}
{70, 80, 90}

When writing computer programs it is wise to state the edge cases. Given the same sequence, what if we wish to choose none of them? There does exist a subsequence which has zero elements, so we should produce it; the answer would be { { } }. What if we have a sequence of five items and we wish to choose six of them? There is no way to do that; there is no six-element subsequence. So the answer should be { }, the empty sequence of sequences.
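To make those edge cases concrete, here is a small check using Python's standard itertools.combinations. This is only an illustration in a different language from the C# this series uses, not the implementation this post builds; the names items, three, none and six are mine.

```python
from itertools import combinations

items = [50, 60, 70, 80, 90]

# Ordinary case: every three-element subsequence, in source order.
three = list(combinations(items, 3))

# Choose none: there is exactly one combination, the empty one.
none = list(combinations(items, 0))

# Choose six from five: there are no combinations at all.
six = list(combinations(items, 6))

print(len(three), none, six)  # prints: 10 [()] []
```

Note the difference between the last two results: a sequence containing one empty combination versus an empty sequence of combinations.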

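The first clever bit is to notice that essentially what we are doing in each subsequence is deciding whether or not to include each element from the sequence. Let’s lay out those subsequences again and indicate whether the corresponding member was included or not, true or false:

{50, 60, 70}: {true, true, true, false, false}
{50, 60, 80}: {true, true, false, true, false}
{50, 60, 90}: {true, true, false, false, true}
{50, 70, 80}: {true, false, true, true, false}
{50, 70, 90}: {true, false, true, false, true}
{50, 80, 90}: {true, false, false, true, true}
{60, 70, 80}: {false, true, true, true, false}
{60, 70, 90}: {false, true, true, false, true}
{60, 80, 90}: {false, true, false, true, true}
{70, 80, 90}: {false, false, true, true, true}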

It’s a fun problem to approach with just a whiteboard. You write out all the sub-sequences of {A,B,C}, and write out the truth table for whether A, B or C was included. Then you think, hang on a minute, that’s just a binary number incrementing.

Though I’ve not tried solving it for specific lengths, so now I’m curious to see if there’s a fancier way than just running through the enumeration and discarding results that are too short or too long.
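For reference, the enumerate-and-discard idea might look something like the following Python sketch. The function name combinations_by_counting is made up for illustration, and this is the brute-force version: it visits all 2^n masks even though only a fraction of them have the right number of bits set.

```python
def combinations_by_counting(items, k):
    """Yield the k-element subsequences of items by counting through every mask."""
    n = len(items)
    for mask in range(1 << n):
        # Bit i of mask records whether items[i] is included.
        if bin(mask).count("1") == k:
            yield [items[i] for i in range(n) if (mask >> i) & 1]


print(list(combinations_by_counting("ABC", 2)))
# prints: [['A', 'B'], ['A', 'C'], ['B', 'C']]
```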

An interesting approach to enumerating combinations, especially if the set of items has 63 or fewer items, is to write a function which, given a value with some number of bits set, will return the next larger value with the same number of bits set. This can be done quickly if a system has an efficient way to divide two numbers which are known to be powers of two (unfortunately I know of no way to express such a concept in C# or CIL).

Start by determining the LSB of the number (which can be done with “lsb = n & ~(n-1);”). Then compute “temp = n+lsb; newSet = temp & ~n;” which will indicate the highest bit that needs to become set in the next value. Finally “adj = newSet/lsb/2-1;” to work out which low bits must be set again, since the addition may have cleared more bits than it set, and “return temp+adj;” to get the final answer.
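A minimal Python sketch of those steps follows, assuming n is a positive integer. The function names next_same_bit_count and combinations_by_bits are mine, and the k == 0 special case is an addition needed because a mask of zero has no lowest set bit to divide by. Python integers are unbounded, so the 63-item limit does not apply here.

```python
def next_same_bit_count(n):
    """Return the next larger integer with as many set bits as n (requires n > 0)."""
    lsb = n & -n                   # isolate the lowest set bit
    temp = n + lsb                 # the carry ripples through the lowest run of ones
    new_set = temp & ~n            # the single bit the carry switched on
    adj = new_set // lsb // 2 - 1  # the rest of that run, moved down to the lowest bits
    return temp + adj


def combinations_by_bits(items, k):
    """Yield the k-element subsequences of items, one per k-bit mask."""
    n = len(items)
    if k == 0:
        yield []                   # added special case: one way to choose nothing
        return
    mask = (1 << k) - 1            # smallest value with k bits set
    while mask < (1 << n):         # never entered when k > n
        yield [items[i] for i in range(n) if (mask >> i) & 1]
        mask = next_same_bit_count(mask)


print(bin(next_same_bit_count(0b0110)))       # prints: 0b1001
print(list(combinations_by_bits("ABCD", 2)))
```

Unlike counting through every mask and discarding, this visits only the masks that have exactly k bits set.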

My first thought was to try recursion: given a set s of n elements and a count k, if n == k, return { s }; if n < k, return { }; otherwise recurse twice. Pick an element x from s, and let s' = s – { x }. Recurse with s', n-1, and k-1, and then add x back into each of those subsets; then recurse with s', n-1, and k. Union the two results and you have your answer.
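Sketched in Python, that recursion might look like the following. The name choose is made up, the element x is taken to be the first item so that results stay in source order, and the k == 0 base case is my addition: without it, asking for zero elements of a non-empty set would never reach n == k.

```python
def choose(items, k):
    """Return all k-element subsequences of items as a list of lists."""
    n = len(items)
    if k == 0:
        return [[]]           # added base case (assumption): one way to choose nothing
    if n < k:
        return []             # not enough elements left
    if n == k:
        return [list(items)]  # must take everything
    x, rest = items[0], items[1:]
    with_x = [[x] + combo for combo in choose(rest, k - 1)]
    without_x = choose(rest, k)
    return with_x + without_x


print(choose([50, 60, 70, 80, 90], 3)[:3])
# prints: [[50, 60, 70], [50, 60, 80], [50, 60, 90]]
```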

Of course, there's a bit of overlap here. You'll be calling with the same small n/k values more than once. Maybe we would want to memoize the results somewhere, as you've done in some examples in the past.

This is basically the approach we’re going to take, though I’ll use a slightly different base case.

Yes, you do end up making the subsequences towards the end over and over again. It’s hard to know what is harder on the GC — many small allocations of structures you’ve seen before that go away quickly vs an ever-growing cache — without profiling. My assumption is that each enumerated combination will be used and discarded, and the GC will clean it up quickly.

I won’t pretend to have a proper explanation, but I think I found something that seems nice (to me at least), and the idea comes from seeing someone playing the accordion xD

(* weird explanation below *)
start with all the trues at one side (the left for me), and see them as a train of wagons
advance the first one you can (the first true followed by a false); if none is found then you’re done
all the “wagons” before it (the trues before the one moved) are reset to the start
rinse and repeat
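
The steps above can be sketched in Python roughly as follows. This is one reading of the description rather than the code in the fiddle; the name wagon_combinations and the list-of-booleans representation are assumptions made for illustration.

```python
def wagon_combinations(items, k):
    """Yield k-element subsequences by shuffling a row of true/false flags."""
    n = len(items)
    if k > n:
        return
    flags = [i < k for i in range(n)]   # all the trues start on the left
    while True:
        yield [item for item, keep in zip(items, flags) if keep]
        # Find the first true that is immediately followed by a false.
        for i in range(n - 1):
            if flags[i] and not flags[i + 1]:
                break
        else:
            return                      # nothing can advance, so we are done
        flags[i], flags[i + 1] = False, True   # advance that wagon one step
        behind = sum(flags[:i])                # wagons left behind it
        flags[:i] = [j < behind for j in range(i)]  # reset them to the start


print(list(wagon_combinations("ABCD", 2)))
```

Read as bits, with the left end as the least significant, each step produces the next larger number with the same count of trues.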

I know it’s not a good way of describing this (and not having English as my first language doesn’t help) but it’s more than nothing 😀

I made a fiddle of all this; it’s not commented but I hope it’s clear enough
you can see it here (the site requires Silverlight) https://dotnetfiddle.net/mjt7QY

Indeed, I considered describing this algorithm. I see it the same way you do: one of the bits sets off heading to the left, goes as far as possible, and then runs back home and grabs a second bit to go with it on the journey again — and then recurse!

I’m thinking about using Enumerable.Range as a source for generating numbers in another radix, and using those to tag the elements of a set the way you tag them with 0 and 1 for the power set. Maybe use a-z to tag the elements instead of digits, in radix 26.