Tracking the States of a Set of Objects by PartitionAn Introduction to the Partition Container

In this post I share a useful programming technique I first saw used
twenty-some years ago while reading through some C code. The technique
combined aspects of an array and a linked-list into a single data construct.
I've here abstracted the technique into a reusable container I call a
partition. (If someone knows of some library in some language that offers a
similar container I'd appreciate a reference.)

A partition is a high performance container that organizes
a set of N sequential integer IDs (which together are called the domain) into
an arbitrary number of non-overlapping groups. (I.e., each ID of the
domain can be in at most one group.) The functionality of a partition
includes these constant time operations:

Given an ID, determine the group to which the ID belongs (if any).

Given a group, find an ID in that group (if any).

Given an ID, move that ID to a particular group (simultaneously removing
it from the group with which it was previously a member if any).

None of the above operations use the heap and are each considerably faster
than even a single push to standard implementations of a doubly linked list.

A partition has not one, but two fundamental classes: Domain and Group.
The set of sequential IDs in a partition are defined by a Domain
object; and a Group is a list of IDs from a Domain. Domain and Group each
have two templatization parameters M and V. IDs can be bound to a
user-defined member object of type M and Groups can be bound to a user
defined value object of type V. Together these associations enable
mappings from objects of type M (with known IDs) to objects of type V, and
conversely from objects of type V (with known Groups) to a list of objects
of type M.

The partition is potentially useful any time objects need to be
organized into groups; and that happens all the time! In this post, I show
how you can use a partition to track the states of a set of like objects.
This is just one possible usage of a partition and is intended only as a
tutorial example of how you can use this versatile storage class.

Contents

Motivation

Suppose you have a set of N objects of some common type each having an
independent state variable. Perhaps you have hardware monitoring the
bit-error-rates on N communications ports and it is your job to categorize
them into signal quality states of CLEAR, DEGRADED or FAILED. Or perhaps
you are tracking the BUSY-IDLE states of N processors in a massively
parallel computer processing environment. Or you might be tracking
the activity of N threads in a thread pool. Obviously, such problems abound.

A common use-case for such systems is to find one or more objects in a
particular state. For example, you may need to find one port in the
CLEAR state for allocation to some new network communications service; or
you may want to find and list all processors in the BUSY state to
perform an audit. If the number of objects is small, you could simply
iterate over them all until you find the objects with the desired
state; but if you have many objects, this approach can be highly inefficient.
To solve such problems for large sets, you will naturally use containers to
hold objects that share a common state; but what type of container should
you use?

For C++ you might group your objects into standard STL maps.
Similarly, for Java you could use HashMaps from the java.util package.
These classes provide convenient solutions, but such heavy-weight
containers may be prohibitively slow for some applications.

A lighter-weight alternative is to use linked-lists, but then a state
change of, for example, a circuit from CLEAR to DEGRADED now requires you
first find the circuit in the CLEAR list so that it can be removed from that
list before adding it to the DEGRADED list; this search can again be slow for
large sets. In the C++ domain you can remedy this by giving each circuit
object an iterator that tracks its own location in the linked-list of which it
is currently a member. This is actually an effective (if slightly cumbersome)
solution. In the Java domain, such an approach is not possible if you restrict
yourself to standard library containers which suffer from the limitation that
iterators point between two members and are in any case necessarily invalidated
whenever the container they reference is updated. (Oh, the horror! I need a
Prozac!)

The partition container is specialized to handle exactly this type of problem
and is both easier to use and faster than any of the above alternatives.

An Example

The problem

A partition is often useful for solving resource management problems. For our working example, we'll track the state of a box of eight crayons being shared by my two girls Becky and Cate to draw pictures. (This is not the most compelling resource management problem, but it is at least easy to understand and certainly doesn't look like any proprietary software I've ever written.) I want to show how you might model a resource that has multiple states, so we'll give each crayon both a User state (which can be one of BECKY, CATE, or BOX), and a Quality state (which can be one of SHARP or DULL). The programming problem is merely to efficiently keep track of this state information for all eight crayons and support a variety of use cases, like "find a sharp crayon in the box", "get the state of the red crayon", or "return all of Cate's crayons to the box".

Now create a class CrayonManager to track the crayon states. Start by giving CrayonManager a Domain to hold the crayons. We'll use the partition to map from a crayon color to a CrayonState; accordingly, the Domain has templatization parameters string and CrayonState. Alternatively, we could have defined two Domains (one to map crayons to user state, and another to map crayons to quality state), but it turns out to be more useful to have a single Domain that maps to a multi-valued state.

Add a method to get the current compound state of a crayon. Remember that the state of a crayon is defined by its Group membership. The Domain's getValue() method finds the Group of the given crayon, and returns the user-assigned value of that Group (which is just the CrayonState object we set above). I've also added a couple of convenience methods to return just the crayon's User state and Quality state.

The setState method sets a crayon's full compound state by moving that crayon to the Group representing the given state. Note that we only have to add the crayon to its new Group — the crayon is removed from its previous Group automatically. I've also provided convenience methods for setting each component of a crayon's state independently. For simplicity, I'll largely be ignoring exceptional conditions, but here I at least go to the effort to ignore attempts to set the state of the NONE_AVAILABLE crayon.

findPreferred will try to find a crayon with the requested quality and user state. If none exists, it will look for a crayon with the opposite quality but that same user state. If the user has no crayons at all, it returns NONE_AVAILABLE.

So we can see what's going on, update the state setters to print the state changes that are taking effect, and define a streaming operator for the crayon manager itself that prints the current state of all crayons.

Finally, add a main routine to exercise things. To do it right, I'd have to define what to do when any of the find methods return NONE_AVAILABLE, but for this tutorial I just pass these failures onto the state setters which ignore them.

Conclusions

A software partition is an exceptionally fast container useful for organizing a set of
like objects into groups. Applicable programming problems are common: I personally have
used a partition-based approach to achieve elegant solutions to circuit state modeling
problems, problems in graph theory, and others. If you too are a software developer,
I suspect you'll find uses for Partition as well if you only look for them.

Software Downloads and Documentation

This software is protected by the GNU general public license version 3. This is free software (as defined by the License), but I'd very much appreciate it if you leave a comment to let me know if and/or how you've found the software useful.

I have not gone to the effort to put together a commercial license, but if someone is interested, I'll make one available.

Although I've used partition programming techniques at multiple companies during my software career, this design and implementation is new and should be considered experimental. I'm not sure I'm satisfied with the public interface (particularly due to arguably cumbersome method names that resulted from the decision to not include a reverse iterator). Future versions may not be backwards compatible. This software has been extensively unit-tested, but only with gcc version 4.4.3 on Ubuntu 10.04.4. Please, please contact me if you discover platform compatibility problems; bugs; design deficiencies; and/or documentation errors (including misspellings or grammatical errors). Thanks!

About matt

I live in Lakewood, Colorado with my wonderful wife Tammy and three fine
children Aaron, Becky and Cate. I'm an electrical engineer by education, but worked as a software developer for about 20 years. I recently took a promotion to full time care giver, but will likely return to the grind once the kids are a little older. I coach basketball, sing in two different church choirs, tutor mathematics and physics at a local high school, and spend what other time I can find pursuing home improvement projects and several pet software projects.

Your email address will not be published. Required fields are marked *

Comment

Name *

Email *

Website

Your comments will not appear until I've had a chance to screen them. This
usually takes a day or less, but could be longer if I'm canoeing or camping or
am otherwise beyond the reach of the internet. I respond to all
questions.