Classloader Hell

I really like Java. It’s definitely my favorite language, and the one I’ve written the most code in, by far. Its balance of static typing, great libraries, OS independence, “no surprises” philosophy, and unsurpassed tool support mean that I’m more productive in Java than in any other language I’ve tried.

But that’s not to say it’s perfect. There can never be an objectively perfect language, since there are too many kinds of software that need to be written, and all languages have expressive tradeoffs that affect their usefulness for those purposes. But even on its own merits, Java’s got some major issues.

The biggest issue is the infamously named “classloader hell”. All reasonably experienced Java programmers know it bitterly well: the difficulties of running code within application servers or other kinds of complex deployment situations, where there are multiple sub-projects all loading code at runtime and attempting to operate with each other. (Briefly, for those who don’t know, Java code is loaded by objects named “classloaders”; classloaders define a name space of loaded classes, and a single class, if loaded by multiple classloaders, is considered to be two distinct — and non-interoperable — classes.)

This is one area where Java’s original standard was written in haste. The model of classloaders delegating to each other, the way that classes are considered to be different if loaded from different classloaders, and the many problems this can cause — all are fundamental problems that manifest in a variety of confusing and confounding ways.

The problem gets much worse when you get into code interoperating between multiple systems on a network, each of which may have subtly different versions of the system’s code. The entire Remote Method Invocation standard was predicated on the assumption that Java’s ability to load code over the network would enable code to travel alongside the network messages that reference that code. It feels to me like a case of “wow, Java lets code run anywhere! Just think of what that lets you do for a networking protocol!” The problem was that the versioning issues weren’t well understood, and that has caused bitter pain in actual usage.

It’s not like Sun isn’t aware of the issues. This technical report covers the basics: it’s possible to set up RMI scenarios where perfectly innocent messaging patterns result in systems being unable to send messages to each other at all. Worse, the runtime behavior is difficult to analyze or understand.

These problems aren’t merely abstract. Another Sun paper discusses an entire advanced networking system that was well into development, based on Jini and RMI, when these problems reared their ugly heads and fundamentally destabilized the entire project. The ultimate reason for abandoning the project was its business infeasibility, but it’s clear that Sun’s own technologies — as currently implemented — were deeply broken.

Now, security, versioning, and interoperability are notoriously hard problems, and it’s far from clear that anyone has really great solutions to them. Certainly Sun can’t be too deeply faulted for not getting it right — Java did get many other things right (including OS independence, dynamic compilation, dynamic inlining, garbage collection, and much more). So I don’t mean to pick too deeply on Java; I just mean to highlight some of the biggest outstanding issues with the language.

There’s work going on to fix at least some of this: for example, the Java module JSR, #277, is intended to bring at least some rigor to the entire classloading system. The .NET framework has a better grasp of module loading issues, though I’m not aware of how .NET messaging compares to RMI or whether it’s exposed to any of the same issues. The Fortress language is taking a much more rigorous approach to component versioning and component assembly as well. But most of these seem dedicated to fixing the module loading problems in a single environment; I don’t know of much work on addressing the security and versioning issues associated with network messaging (though JSR 277 does address some of the naming issues that break RMI and Jini).

Still, that’s the great thing about software: there’s always more that needs fixing, and more great things to build! Expect many more posts on this general topic….

I'm a programmer, dad, and general geek living in the Seattle area. Currently I am a senior software development engineer at Microsoft.

My coding interests run towards language design, parallel and asynchronous development, operating systems, and metaprogramming.
I expect the world of software to be challenging, inspiring, and fascinating for the rest of my life; in few other pursuits is there such great possibility.
Email me here.