Tuesday, February 17, 2009

In this installment, I’m going to discuss Java usage tips and tricks for the Apache Commons Collections Bag interface. Honestly, the first time I saw the Apache Commons Collections Bag API, I thought, "Well, that isn't very useful." The Bag interface states the intention of the class: to count occurrences of objects. As with the other collections in Apache Commons Collections, Predicates, Closures, and Transformers can be plugged into a bag at run time to determine its behavior. I ran some tests, and the results are at the end of the article. The cost of using the framework is negligible, at about 10 to 20 milliseconds (the sample run at the end shows a difference of 6 milliseconds over 156,000 items). This, of course, is compared to the traditional method of simply counting occurrences and stuffing them into a map. You be the judge: let me know if you think the test is accurate.
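For reference, the "traditional" approach mentioned above can be sketched with nothing but the JDK. This is a minimal illustration, not the benchmark code from the tests; the class name TraditionalCount and the sample values are assumptions made up for this example.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the original sample project.
class TraditionalCount {
    // Count how many times each element occurs, using a plain HashMap.
    static <T> Map<T, Integer> countOccurrences(Iterable<T> items) {
        Map<T, Integer> counts = new HashMap<>();
        for (T item : items) {
            // Start at 1 for a new element, otherwise add 1 to the existing count.
            counts.merge(item, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> states = Arrays.asList("OH", "TX", "OH", "CA", "OH");
        System.out.println(countOccurrences(states).get("OH")); // prints 3
    }
}
```

A Bag does the same bookkeeping behind an interface that names the intent, which is what lets decorators such as a Transformer be layered on top.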

So, what kind of problem justifies adding the complexity of Bags to a project? By combining a Transformer, a MultiKey, and a decorator, you can produce a dynamic solution for counting groups of occurrences of bean properties in a collection. This would be suitable for a generic controller responsible for counting groups of properties, and it would be very helpful in a reporting system. Let’s take a look at the sample and walk through the code.

The purpose of the code on lines 38 through 41 is just to print the properties of the bean, LaborForce, minus the class property. It is important to see how PropertyUtils describes the bean, because we will be accessing it later via the same API.
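To see where that extra "class" entry comes from, here is a rough analogue built on the JDK's own java.beans.Introspector rather than on PropertyUtils. It is only a sketch: the Worker bean is a hypothetical stand-in for LaborForce, and the describe helper is an assumption written for this example, not the Commons BeanUtils method.

```java
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.Map;
import java.util.TreeMap;

class DescribeSketch {
    // Hypothetical stand-in for the article's LaborForce bean.
    public static class Worker {
        private final String state;
        private final String gender;

        public Worker(String state, String gender) {
            this.state = state;
            this.gender = gender;
        }

        public String getState() { return state; }
        public String getGender() { return gender; }
    }

    // Map every readable bean property to its value, optionally skipping "class".
    static Map<String, Object> describe(Object bean, boolean skipClass) {
        Map<String, Object> out = new TreeMap<>();
        try {
            for (PropertyDescriptor pd
                    : Introspector.getBeanInfo(bean.getClass()).getPropertyDescriptors()) {
                if (pd.getReadMethod() == null) {
                    continue; // write-only property, nothing to read
                }
                if (skipClass && "class".equals(pd.getName())) {
                    continue; // getClass() shows up as a property named "class"
                }
                out.put(pd.getName(), pd.getReadMethod().invoke(bean));
            }
        } catch (IntrospectionException | ReflectiveOperationException e) {
            throw new IllegalStateException("Could not describe " + bean, e);
        }
        return out;
    }

    public static void main(String[] args) {
        // Prints the gender and state properties; "class" is filtered out.
        System.out.println(describe(new Worker("OH", "F"), true));
    }
}
```

Because every object inherits getClass(), bean introspection reports a "class" property unless it is filtered out, which is why the sample prints the properties without it.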

On line 43, we create a single HashBag object, which I will refer to as the masterBag. A HashBag is an implementation of Bag backed by a HashMap, so it has the hashing characteristics of a HashSet. When an object is stored in the HashBag, its hash code (together with equals) is used to locate it, just as in a HashSet, so equal objects are stored as a single entry. You might think the count would be 1 no matter how many times you insert the object, but in fact the bag counts the inserted and removed occurrences. Even more confusing, the iterator method returns every occurrence, duplicates included, while the uniqueSet method returns a set of the distinct objects.

Lines 44 through 46 are where the magic happens. We create three decorated Bags with TransformedBag.decorate, each with a variation of the PropertiesMultiKeyTransformer and all backed by a single HashBag, the masterBag from line 43. When an object is inserted into one of the decorated Bags, it is transformed by the PropertiesMultiKeyTransformer and the result is physically placed in the masterBag.

PropertiesMultiKeyTransformer uses PropertyUtils to pull all the given properties off the bean to create a MultiKey object. It’s important that we always put the same property at the same index in the MultiKey's array of objects. If we didn’t, we would get an exception. Furthermore, we can’t simply transform the object into an array of objects, because an array has no content-based hashCode: it inherits the identity-based hashCode and equals from Object, so two arrays holding the same values would be counted as different entries in the bag. The MultiKey, which is effectively a wrapper class around the array of keys that supplies a content-based hashCode and equals, is a perfect fit for this problem, and that is the reason for its use.
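The hashing problem with raw arrays is easy to reproduce with the plain JDK. In the sketch below a java.util.List plays the role that MultiKey plays in the sample, since a List also has content-based equals and hashCode. The class and method names are hypothetical, and this is an analogue of the idea rather than the Commons Collections code itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CompositeKeySketch {
    // Keyed by the raw array: arrays use identity-based hashCode/equals,
    // so every array instance becomes its own entry.
    static int distinctKeysWhenKeyedByArray(List<Object[]> rows) {
        Map<Object[], Integer> counts = new HashMap<>();
        for (Object[] row : rows) {
            counts.merge(row, 1, Integer::sum);
        }
        return counts.size();
    }

    // Keyed by a List copy of the array: equal contents collapse into one entry.
    static Map<List<Object>, Integer> countByWrappedKey(List<Object[]> rows) {
        Map<List<Object>, Integer> counts = new HashMap<>();
        for (Object[] row : rows) {
            List<Object> key = new ArrayList<>(Arrays.asList(row));
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Object[]> rows = new ArrayList<>();
        rows.add(new Object[] {"OH", "F"});
        rows.add(new Object[] {"OH", "F"});
        rows.add(new Object[] {"TX", "M"});

        System.out.println(distinctKeysWhenKeyedByArray(rows)); // prints 3
        System.out.println(countByWrappedKey(rows).get(Arrays.asList("OH", "F"))); // prints 2
    }
}
```

The two identical rows are counted separately when keyed by array and together when keyed by the wrapper, which is the behavior the bag needs.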

On lines 52 through 58, I ask the masterBag for its unique set and load the results into a TreeSet that has a chained Comparator. The result is a sorted set of MultiKeys with counts of the occurrences of the vectors in the list.
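The same reporting step can be approximated with the JDK alone. The sketch below assumes the counts are already in a map keyed by two-element lists (standing in for MultiKeys) and uses Comparator.comparing with thenComparing where the sample uses a chained comparator. All names here are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

class SortedCountsSketch {
    // Sort the distinct keys by their first element, then their second,
    // and pair each key with its count.
    static List<String> report(Map<List<String>, Integer> counts) {
        Comparator<List<String>> chained =
                Comparator.comparing((List<String> key) -> key.get(0))
                          .thenComparing(key -> key.get(1));

        TreeSet<List<String>> sortedKeys = new TreeSet<>(chained);
        sortedKeys.addAll(counts.keySet());

        List<String> lines = new ArrayList<>();
        for (List<String> key : sortedKeys) {
            lines.add(key + " = " + counts.get(key));
        }
        return lines;
    }

    public static void main(String[] args) {
        Map<List<String>, Integer> counts = new HashMap<>();
        counts.put(Arrays.asList("TX", "M"), 1);
        counts.put(Arrays.asList("OH", "M"), 1);
        counts.put(Arrays.asList("OH", "F"), 2);

        // Prints:
        // [OH, F] = 2
        // [OH, M] = 1
        // [TX, M] = 1
        report(counts).forEach(System.out::println);
    }
}
```

Note that a TreeSet treats keys that compare as equal as duplicates, so the comparator chain has to cover every position in the key.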

The inner class PropertiesMultiKeyTransformer is a Transformer constructed with an array of Strings that represent the property names to use with PropertyUtils.getProperty on the beans in the list. It dynamically pulls the given properties from the bean and loads them into the new MultiKey. There is a limit to the number of Objects a MultiKey can accept, but for this demo, we are OK. In addition, the Objects in the MultiKey have to be consistently in the same order for each comparator. We are using a ComparatorUtils.chainedComparator of MultiKeyCompartor, which will throw a ClassCastException if the objects are in the wrong order.
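As a rough picture of what such a transformer does, here is a plain-JDK sketch. It is not the PropertiesMultiKeyTransformer from the sample: it assumes simple getXxx accessors, uses reflection instead of PropertyUtils, returns a List instead of a MultiKey, and counts into a HashMap that stands in for the masterBag. Worker, keyOf, and countGroups are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

class PropertyKeySketch {
    // Hypothetical stand-in for the article's LaborForce bean.
    public static class Worker {
        private final String state;
        private final String gender;

        public Worker(String state, String gender) {
            this.state = state;
            this.gender = gender;
        }

        public String getState() { return state; }
        public String getGender() { return gender; }
    }

    // Build a function that reads the named properties, always in the given order.
    static Function<Object, List<Object>> keyOf(String... propertyNames) {
        return bean -> {
            List<Object> key = new ArrayList<>();
            for (String name : propertyNames) {
                String getter = "get" + Character.toUpperCase(name.charAt(0)) + name.substring(1);
                try {
                    key.add(bean.getClass().getMethod(getter).invoke(bean));
                } catch (ReflectiveOperationException e) {
                    throw new IllegalArgumentException("No readable property: " + name, e);
                }
            }
            return key;
        };
    }

    // Run every extractor over every bean and count the keys in one shared map.
    static Map<List<Object>, Integer> countGroups(
            List<?> beans, List<Function<Object, List<Object>>> extractors) {
        Map<List<Object>, Integer> master = new HashMap<>();
        for (Object bean : beans) {
            for (Function<Object, List<Object>> extractor : extractors) {
                master.merge(extractor.apply(bean), 1, Integer::sum);
            }
        }
        return master;
    }

    public static void main(String[] args) {
        List<Worker> workers = Arrays.asList(
                new Worker("OH", "F"), new Worker("OH", "F"), new Worker("TX", "M"));
        List<Function<Object, List<Object>>> extractors =
                Arrays.asList(keyOf("state"), keyOf("state", "gender"));

        Map<List<Object>, Integer> master = countGroups(workers, extractors);
        System.out.println(master.get(Arrays.asList("OH")));      // prints 2
        System.out.println(master.get(Arrays.asList("OH", "F"))); // prints 2
    }
}
```

Because each extractor always emits its properties in the same order, keys produced by the same extractor line up position by position, which is the consistency the comparators rely on.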

The data transfer object used within this project is the LaborForce class; see the following sample. You can find State and Gender in some of the other projects used in this blog.

The sample listOfSampleData length is 156000
---------- Start Test ----------
With a Bag, Average Time in seconds and milliseconds is 00.032
---------- End Test ----------
---------- Start Test ----------
Traditional counting, Average Time in seconds and milliseconds is 00.026
---------- End Test ----------

Thanks for reading this article. I look forward to any questions or feedback.