Binary Field Search – re-entrant re-implementation of re[-]cursion

2010/06/21

(say that i++ times fast!)

As part of the main product I’m working on right now, the master node must be able to identify and communicate with multiple nodes on a shared bus in an addressable manner. This requires a mechanism to enumerate the devices in a reasonably controlled time based on their serial number or other signature data. Because it’s a shared medium, asking all the stations to respond just ends up with a complete jumble on the bus. Stepping through the entire potential space saying “does 1 exist? does 2 exist? does 3 exist? …” is wildly impractical given the nominal 32-bit search space – asking 4+ billion times isn’t in our budget…

The solution is a derivative of the standard binary search. Normally you’re only looking for a single value, so you just keep checking digits until you find the one correct value. For 8-bit, you’d start with “greater than 0x80, or less?”. If less than, “greater than 0x40, or less?”. If greater, “greater than 0x60, or less?”, rinse repeat. The challenge to a field search rather than a [single] value search is that once you’ve honed in on the first value, you have to back up and continue searching the rest of the space.

I previously implemented a strictly recursive version for ease of development (since I was fighting with the mechanism of asking the question at the same time). It’s a fairly trivial extension of a value search: instead of breaking all the way out after finding a full match, it doesn’t. Here’s [pseudo] pseudo-code for it, with the highlighted code being the difference between value and field search:

(beware I’m writing this off the top of my head and I can almost guarantee it’s not strictly correct)

A value search returns up through the entire recursive stack as soon as it finds the first match. Remove the return stack and let it keep going, and you’ve got a field search.

The problem is that this implementation is monolithic, and runs until completion. For a large enough number of stations it can take many hundreds of queries before it finds all the nodes. For the product in question that’s not a fatal problem, but I’d really like to be able to send other packets on the bus with a particular regularity that would be precluded by having the search run to completion. As a result, a recursive technique isn’t in the cards since I have no threading stack on the chip. That means developing a re-entrant form that works on the same principle.

What I’ve developed (with no pretenses that it’s new, FWIW) is a fairly basic state machine that uses the value and curbit variables alone, rather than in combination with the current program counter as it steps from the clear to set portions of search().

The basic rules are simple: every call generates a query based on the current value and curbit, then modifies the as appropriate depending on the success of the query and the bits present in value. Initial state is value = 0x00000000 and curbit = 31. For each iteration: If the search matches, and we’re not on the last bit, simply move to the next bit down (curbit–) and start over. If we are on the last bit but it’s a zero, set it to one before starting over. If it’s a one, we “pop” all contiguous 1’s off the end (e.g. 01001011 pops twice) while resetting them to zero, change the zero above that to a one, and start over. If the search doesn’t match, we check the current bit. If it’s a zero, we change it to one and start over. If it’s a one, “pop” all the ones and change the preceding zero to a one before starting over. If we pop to a bit that doesn’t exist (in this case 32), we’re done with the entire search space.

The search pattern generated by this algorithm is identical to the recursive technique above, but the state is contained completely in the variables rather than the code pointer. This means I can intersperse other packets in the bus while the search continues independently. In my case the add_station() code actually tells the station in question to stop responding to these queries, so unless more stations show up on the bus between runs, most of the time will be spent repeating the first two queries to no avail (0xxxxxx? 1xxxxxx? Bueller? Bueller?). Since the purpose of this state of the product is to keep catching new stations being physically attached to the bus, most query sequences will only find a single station.