JVM Languages

Sorting in Java & C#

Source Code Accompanies This Article. Download It Now.

David uses reflection to sort a list of complex types using both Java and C#.

January, 2006: Sorting in Java & C#

David teaches and writes about Java and C#, and can be contacted at dph@apk.net.

Sorting lists of built-in types or primitives in Java or C# is straightforward. Both languages offer sorting routines that work on in-memory lists of comparable itemscomparable because each item in the list implements a "compare" interface expected by the list's sorting routine. In Java, each element to be sorted implements the java.lang.Comparable interface, and in C#, each element implements the System.Collections.IComparer interface. These interfaces allow types to define a natural ordering of instances of their class.

In Java, the java.util.Collections class can be used to sort lists, and in C#, the System.Collections.ArrayList class also offers a sort routine. These do their sorting by using the comparison routine defined in the java.lang.Comparable interface or the System.Collections.IComparer interface. Numerous built-in language types already implement this interface, so sorting types such as strings, numbers, and dates takes no extra effort beyond loading up the list and calling the sort method.

However, we often need to sort lists of complex domain-specific types such as Employee or Animalnot just lists of built-in strings. In this article, I use reflection to sort a list of complex types through a fully implemented example using Animals, leaving you with the chance to try it out by completing a partial implementation of Employees. Neither Animal nor Employee implements a comparator interface that would make them fit the sorting routines of either language. Users will want to sort a list of domain-specific objects on domain-specific fields. They'll want to sort a list of Employees by first name, or by last name, or by department, or by hire date, by contractor status, by manager, or by salary. Without trying very hard, that's seven possible ways to sort a list of Employees.

Both languages do offer a means for sorting user-supplied complex types by providing an additional sort method that takes both the list and a separate comparator that knows how to compare the elements in the list. Though the comparator is a separate type from the elements in the list, it is coded to know how to compare the elements in the list in some specific way. This separate, standalone comparator works the same way in both languagesit knows how to compare two like objects such that they can be ordered like this:

Obviously, the standalone comparator must know what field of the complex object is plucked from the list to sort onEmployee.salary or Employee.hireDate, and so on. For this to sort lists of Employees seven different ways, there would need to be seven java.util.Comparator implementations (or seven .NET System.Collections.IComparer implementations). The code would then need to pass to the sort method the list of Employees and the appropriate Comparator/IComparer implementation for the chosen field.

A time-tested paradigm to solve this kind of problem is to build an EmployeeComparator factory that knows how to supply all seven Comparators. You obtain the specific comparator for the chosen field by passing a key (such as "firstname") into the factory, and you get back a comparator implementation that knows how to compare Employee objects specifically by first name. This technique works well, and it is possible to lazily build comparator implementations only once and only as needed, store them in a map, and access them quickly by key from the map. Listing One is an incomplete implementation of the Java Employee class, and Listing Two is a partially completed comparator factory for it. (The complete code for sorting Animals with nested Classification references in both C# and Java is available electronically, see "Resource Center," page 4.)

The problem with an approach like this is that you have to code each key-comparator requirement into the factory, as indicated by the case blocks in the sample factory class. This makes it difficult to extend, and cumbersome for types such as Employee, which have more than a mere handful of sortable fields. As your application requirements grow to include associating a Project with each Employee, a new field called currentProject gets added to the Employee class. You then have the ability to display each employee's current project when you show a list of Employees. Because users will see this and want to sort by currentProject, you must go back to the factory and build another case block. Using reflection, you can have one reusable comparator, and therefore, avoid this maintenance.

Reflecting

Both Java and C# let you examine the details of each type that has been loaded into the runtime environment during code execution. The first time a type is loaded into the runtime environment of Java, for example, the environment essentially creates and stores a template of type Class that describes everything about the newly loaded type. When an instance of a particular type is asked for by the application, the template is used to render that instance, identifying all the fields, methods, properties, and constructors. Using reflection, all of these can be made available to applications at runtime.

In Java, the template object is the java.lang.Class object, and in C#, it is the System.Type object. Both can be used programmatically with reflection to look up methods by name and invoke them on specific objects. I access the appropriate field of the complex type using this ability of reflection to work by name. The C# Animal class in Listing Three is an easy domain-specific type to start with; Listing Four is its Java counterpart. They are nearly identical, except that the C# Animal defines age to be a true C# Property of the Animal, available by a "get" property accessor, while all the members of the Java Animal class are accessed by a method. There is a minor difference in the C# class involved if the reflected item is a property versus a method, and the code accounts for both.

Class Animal aggregates just three fields: a string animalType with values such as Monkey, Dog, or Bird; an int for the Animal's age; and another complex type called Classification that basically is used to house values such as mammal, reptile, amphibian, and so on. The Animal constructor wants all three values as arguments. When sorting Animals, the string animalType field and int age field clearly would be built-in types that already implement Comparator/IComparer. The Classification reference of an Animal by default is not sortable, but you can make it sortable, as will be shown later. The three steps involved in using reflection to access the animalType field of an Animal instance by method name are:

Get the template for the given class. In C#, obtaining the System.Type template looks like this:

Ask the System.Type reference to vend the template for a specific method by name. This returns a reference to a System.Reflection.MethodInfo object that cannot be invoked by itselfit is not associated with the original parakeet object or with any Animal instance. It is just a template that knows what the properties of the named method are: public, returns a string, takes no arguments, and so on. Again, the C# example looks like:

MethodInfo method =
t.GetMethod("GetAnimalType");

System.Reflection.MethodInfo inherits from System.Reflection.MethodBase, which in turn, inherits from System.Reflection.MemberInfothe base type for reflective behavior in .NET. In Java, the method reference is a java.lang.reflect.Method object that inherits from java.lang.reflect.AccessibleObjectthe base class for reflecting Field, Method, and Constructor objects. By the way, Java throws a NoSuchMethodException if it can't find the named method, while C# returns a null MethodInfo reference if it can't find the method. This difference in the way the languages treat reflection has a small design impact on the standalone compare routine. The Java code can forego testing for null, but instead wraps the entire compare routine in a try-catch block.

When it comes time to use this method template to invoke the method, the application needs to tell it which Animal instance it should be invoked on, and supply the method with the values of any arguments it requires. For this example, I assume the methods are simple getters or property accessors, and take no arguments. After obtaining the method template for the public string getAnimalType() method of the Animal class, you can invoke the method on the parakeet object via reflection in C# like this:

Because this is the C# implementation, you have to first check for a null method, which is the only way you will know if the GetMethod request worked. Now you have the value of the animalType field (that you want to sort on) in the result that was returned from the invocation of the method.

Sorting with Reflection

Using the reflection abilities of both languages, you can bypass the factory altogether and reuse a comparator to sort anything that implements the interface required for the comparison. This is accomplished by providing a single, reusable, standalone System.Collections.IComparer (or java.lang.Comparable) implementation for all sorting of complex types. At runtime, the reusable comparator uses "reflection-by-name" to access from the complex type a specific field to be sorted on. The comparator then tests whether or not the reflected field itself implements the System.Collections.IComparer interface and therefore can be used to do comparison itself. In effect, it says: Give me by reflection the animal's animalType, and if the reflected animalType string implements IComparer, then let it handle the sorting.

For the remainder of this article, I concentrate on the C# example. The techniques and outcomes are similar for both languages. The starting point for sorting lists of complex types is with the sort method of the Compare.Comparator class:

When you invoke this method, behind the scenes it uses the standalone, reusable IComparer, so you don't need to be aware of how to bring the comparator into play, only of what arguments to pass to the sort method. The sort method sorts the list (of Animals) in argument one by comparing the results of reflection used to invoke the method named in argument two (for example, "GetAnimalType" ). Finally, it renders the results in ascending or descending order according to the direction supplied in the last argument. Invoke the sort routine as in CompareTest.cs or CompareTest (both available electronically), a snippet that does the sort on animalType follows. Recall, the Animal constructor wants an animalType, an age, and a Classification:

Acquiring the static-level lock on the Comparator's System.Type object to enforce thread safety. There is only going to be one of these locks because there is only one instance of the System.Type template per class loaded into the runtime environment. This ensures that only one thread can gain access to the sorting code, which is important because, as a static method, numerous threads could be vying for it and you would want to be sure that each sort invocation is complete before another invocation has a chance to access the comparator Singleton.

Setting the method name in the Singleton that reflection should use.

Setting the sort direction in the Singleton.

Invoking Sort on the list and passing in the Singleton as the IComparer to use.

The Singleton Compare.Comparator.comparatorInstance is the reusable, standalone comparer, and it is an instance of the ListComparator : IComparer class. As such, it must define public int Compare(object obj1, object obj2). This method, of course, is where the guts of the reusable comparison are. This method follows the steps already outlined for using reflection. It attempts to obtain the System.Reflection.MethodInfo object named by the methodName parameter. If it can't get this, it thinks the methodName names a C# property instead, and tries to obtain the System.Reflection.PropertyInfo object by the same methodName.

With either a MethodInfo or a PropertyInfo in hand, it uses reflection to invoke the named getter, thereby accessing the fields of the two Animal instances from the list that need to be compared. The results of the invocation-by-reflection are stored in object references called result1 and result2,respectively. If the method named was, for example, "GetAnimalType" then result1 and result2 hold strings such as "Monkey" or "Alligator". After doing the reflection, the code tests to see if it can forward the actual comparison on to result1 and result2 by using the as operator to cast both results to IComparable.

This is the beauty of object-oriented code at work. Any type that implements IComparable can be compared using CompareTo. Because most built-in types in C# implement the IComparable interface, it is possible to cast to IComparable and let the built-in type handle the comparison. If the IComparable comp1 = result1 as IComparable; cast succeeds for both result1 and result2, then you can return icomp1.CompareTo(icomp2). If you are not comparing built-in types, but instead comparing another complex type such as Classification, the trick is just to make sure it implements the IComparable interface, and it will fit right into this sort method. If the attempt to cast to IComparable fails for either result1 or result2, you fall back on the trick of converting result1 and result2 to type string, and force a string comparison by returning result1.ToString().CompareTo(result2.ToString()). If the return from the reflected method is a null in either result1 or result2, then the code sorts the null to the front of the list without doing any actual comparison.

The very minor downside to this is that the code is plying the wily skills of reflectiona knock against which is that it might impact performance if the list to be sorted is lengthyalthough I have not noted much penalty in performance as I've been using this technique on lists with thousands of items. But the great benefit is that there exists a single Comparator for every complex type that needs to be sorted. You won't need another comparator, nor need to maintain another comparator factory.

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task.
However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

Video

This month's Dr. Dobb's Journal

This month,
Dr. Dobb's Journal is devoted to mobile programming. We introduce you to Apple's new Swift programming language, discuss the perils of being the third-most-popular mobile platform, revisit SQLite on Android
, and much more!