Thursday, September 25, 2008

You can blame bit rot, API backwards compatibility, plain stupidity, or that we just didn't know any better at the time. Whatever the cause, we have introduced a good deal of accidental complexity over the last eight years or so, and with e4, we have a chance to reduce this or get rid of the "accidental" part altogether.

Here are some examples. If you know of other examples, please let me know!

First example: We have lots of early attempts to define API that had to be tweaked later; to see what I mean, press Ctrl+Shift+T for the Open Type dialog and enter 'I*2'.

Second example: The confusing (and untold) preferences story comes to mind, as a representative of a whole class of problems. Basically, whenever we ended up with many different ways to do the same thing, we were not able to remove old code because clients still depended on it. I am optimistic that the capabilities offered by the platform can be pared down to a manageable list. Ideally, down to something like twenty services, sometimes called "the twenty things" or the "Eclipse Application Model" in the context of e4. Like being able to persist data. Receive input. Produce selections. Schedule background work. Report progress. Provide pointers into the help system. Contribute to the menus and toolbars. And so on, but I don't have a full list at this point so I should write about it when I know more.

Third example: Like many other Java APIs, we live in a kingdom of nouns. We sometimes joke that Eclipse, for every concept, has an adapter, a factory, and a manager. And an adapter factory, and a factory manager. Sadly, this is not a joke at all. ContextManagerFactory. ModifierHelperFactory. AbstractRefactoringDescriptorResourceMapping. I wish I could make concrete suggestions for how we can improve on this, but I am afraid we need to look at these APIs in detail. It's just a gut feeling that names with three or more nondescript nouns make things more complicated than necessary. By the way, my mother tongue is German, so I should be used to putting many nouns together, but I still find that something like INodeAdapterFactoryManager is way over the top.

Fourth example: A good deal of complexity and bloat is caused by the proliferation of preference pages, leading to countless lines of code that support all the different combinations of all the supported preferences. Are we really helping our users by exposing and maintaining all these options?

That's it for today. I'd love to hear what you think about this, or if you have more examples of accidental complexity.

(Disclaimer: I am well aware that e4 will need to be backwards-compatible, so that 3.x plugins continue to run. When I wrote "get rid of" I meant something a little more subtle, as in "move it to compatibility plug-ins so that adopters of e4 don't have to worry about it when they develop new functionality.")

7 comments:

I'd like to see all the singletons being replaced by a nice and simple dependency injection framework.

In addition we should really strive for allowing bundles being started and stopped at runtime (aka dynamic modules). Dependency Injection will help here.
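
To make the singleton-versus-injection point concrete, here is a minimal sketch using plain constructor injection and no framework. All names (ProgressReporter, RecordingReporter, BuildJob) are hypothetical and are not Eclipse or e4 API; the only point is that the job receives its collaborator instead of fetching it through a static getInstance() call.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical service interface for illustration; not an Eclipse API.
interface ProgressReporter {
    void report(String message);
}

// A simple implementation that records messages, handy for tests.
class RecordingReporter implements ProgressReporter {
    final List<String> messages = new ArrayList<>();

    @Override
    public void report(String message) {
        messages.add(message);
    }
}

// The dependency is handed in through the constructor rather than being
// looked up through a global singleton, so it can be swapped freely.
class BuildJob {
    private final ProgressReporter reporter;

    BuildJob(ProgressReporter reporter) {
        this.reporter = reporter;
    }

    void run() {
        reporter.report("build started");
        reporter.report("build finished");
    }
}

class InjectionDemo {
    public static void main(String[] args) {
        RecordingReporter reporter = new RecordingReporter();
        new BuildJob(reporter).run();
        System.out.println(reporter.messages);
    }
}
```

Because BuildJob never names a concrete reporter, whoever wires things together can replace the implementation, which is also what makes stopping and restarting the providing bundle easier to handle.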

I think there's no doubt that we need to further enhance our APIs while staying backward compatible at the same time. Maybe a more progressive API lifecycle (i.e. more use of deprecation and more removal of things that have been deprecated for some time) would make the code base clearer.

Regarding the I*n interfaces, I remember feeling a combination of anger and disappointment when I first learned about this idiom way back ;) I had been coding around what I thought were shortcomings in the interfaces, and had no idea I had more at my disposal. Then the disappointment set in when I realized that this was actually the only reasonable way to evolve interfaces while preserving binary compatibility. In addition to complexity, I think that one of its biggest costs is discoverability for newcomers to the API. At least that was the case for me.
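
For readers who have not run into the idiom, a rough sketch of what I*n looks like in practice follows. The interface names ILabelSource and ILabelSource2 are made up for illustration and are not real Eclipse types; the pattern is that the new method goes into a numbered extension interface, and callers check for it at runtime.

```java
// Hypothetical original interface, shipped in an early release.
interface ILabelSource {
    String getLabel(Object element);
}

// Added later: the new method lives in a numbered extension interface so that
// existing implementors of ILabelSource are not broken.
interface ILabelSource2 extends ILabelSource {
    String getToolTip(Object element);
}

// An implementor written against the first release.
class OldSource implements ILabelSource {
    @Override
    public String getLabel(Object element) {
        return "old:" + element;
    }
}

// An implementor that has stepped up to the newer interface.
class NewSource implements ILabelSource2 {
    @Override
    public String getLabel(Object element) {
        return "new:" + element;
    }

    @Override
    public String getToolTip(Object element) {
        return "tip for " + element;
    }
}

class LabelClient {
    // Callers have to know that ILabelSource2 exists and test for it.
    static String describe(ILabelSource source, Object element) {
        String label = source.getLabel(element);
        if (source instanceof ILabelSource2) {
            return label + " (" + ((ILabelSource2) source).getToolTip(element) + ")";
        }
        return label;
    }
}
```

The instanceof check in describe is where the discoverability cost shows up: nothing in ILabelSource itself tells a newcomer that a richer interface is available.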

To address this, in select places where it was appropriate, on Mylyn we have used abstract classes with no implementation instead of interfaces (similar to the JDK's MouseAdapter pattern). This has allowed us to add methods without breaking clients. In some places we have only extracted an interface from those abstract classes when we had a release or two of feedback on extensions of the abstract class.
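
As a sketch of that abstract-class approach (the names TaskListener and AddCounter are hypothetical, not actual Mylyn API): every callback has an empty body, so a method added in a later release does not break subclasses compiled against the earlier one.

```java
// Hypothetical extension point expressed as an abstract class whose
// callbacks all have empty bodies, in the spirit of java.awt.event.MouseAdapter.
abstract class TaskListener {
    public void taskAdded(String taskId) {
        // intentionally empty; subclasses override what they need
    }

    public void taskRemoved(String taskId) {
        // intentionally empty
    }

    // Imagine this method was added in a later release. Existing subclasses
    // inherit the empty body and keep working unchanged.
    public void taskRenamed(String taskId, String newName) {
        // intentionally empty
    }
}

// A client subclass written before taskRenamed existed.
class AddCounter extends TaskListener {
    int added;

    @Override
    public void taskAdded(String taskId) {
        added++;
    }
}

class AdapterDemo {
    public static void main(String[] args) {
        AddCounter counter = new AddCounter();
        counter.taskAdded("task-1");
        counter.taskRenamed("task-1", "renamed"); // falls through to the no-op
        System.out.println(counter.added);
    }
}
```

The trade-off is the usual one: a client can only extend one class, so this works best where the extension point is the client's main role.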

Earlier this month I spoke with Juergen Hoeller, the Spring Framework guru, presented our approach, and asked how they deal with this problem. He has taken the approach of instead giving the I*n interfaces a descriptive name and trying to separate their roles. Sometimes that can turn out more awkward, and you can get into an awkward noun game with IMumbleFooExtension-style naming.

My current thinking is that the I*n approach will never go away, and that we should admit defeat and use the consistency of this approach to help developers with additional tool support. For example, when implementing an interface or doing content assist in a class, there could be an "Also see IMumble2" pointer that would make the developer aware of the additional interfaces. Alternatively, we could try to convince newcomers to hit Ctrl+T on every single new interface they come across :)

Sven - yes, I agree. I just wasn't sure if I should mention dependency injection in this post. Unfortunately, the term 'dependency injection' sounds so, well, complicated and unnatural, while the concept really is simple and elegant. But I am getting ahead of myself... I should write a dedicated post about dependency injection.

Mik - yes, in many cases, abstract base classes are a much better choice than interfaces. Way back when we worked on Eclipse 1.0, we tried to use interfaces for everything and ended up overusing them.

Another technique that helps is separating client API from provider API. If nobody else is supposed to implement your interface, you can extend it without breaking binary compatibility. For provider API (sometimes called SPI), you can then use separate mechanisms that work better, for example abstract base classes, or a mixin style with interfaces that won't change over time, but where clients can optionally implement additional interfaces when they step up to newer versions.

Finally, a lot of cases where we ended up with I*n names are listener interfaces where we felt the need to add more detail to the generated events. The best practice for this is to have only a single method in the listener interface, handleEvent, and to pass in an event object that can grow additional fields over time.
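
A minimal sketch of that listener style, with hypothetical names (ChangeEvent, IChangeListener, ChangeNotifier) that are not existing platform API: the listener interface has exactly one method and never changes, while the event class is the part that grows.

```java
import java.util.ArrayList;
import java.util.List;

// The event object is the part that is allowed to grow over time.
class ChangeEvent {
    private final Object source;
    // Imagine this field was added in a later release; listeners that do
    // not care about it simply never call getDetail().
    private final String detail;

    ChangeEvent(Object source) {
        this(source, "");
    }

    ChangeEvent(Object source, String detail) {
        this.source = source;
        this.detail = detail;
    }

    Object getSource() {
        return source;
    }

    String getDetail() {
        return detail;
    }
}

// The listener interface has a single method and stays frozen.
interface IChangeListener {
    void handleEvent(ChangeEvent event);
}

class ChangeNotifier {
    private final List<IChangeListener> listeners = new ArrayList<>();

    void addListener(IChangeListener listener) {
        listeners.add(listener);
    }

    void fire(String detail) {
        ChangeEvent event = new ChangeEvent(this, detail);
        for (IChangeListener listener : listeners) {
            listener.handleEvent(event);
        }
    }
}
```

Since new information arrives as new accessors on ChangeEvent, there is no need for an IChangeListener2 when the events become richer.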

@Mik and Boris' reply: the lesson I learned is, "Never hand out an interface for someone else to implement". Period. Mik, your solution of abstract classes to subclass is, I now believe, the right approach, and I think we've started doing that now in the platform (but too late).

As for the many nouns, it's an excellent point. Gads, it makes my head hurt. In general, complexity breeds more complexity. But that's a simplistic answer. I'd like us to take a particular example and kind of turn it into a case; maybe we'd then understand the more general pattern that gets us into trouble. I'm not sure if it's driven by a need to make things extensible in ways which weren't required. We should look at places where we left the door open for extension/modification and see if anyone actually did so. Another thing I've noticed is that we sometimes try to be too helpful in giving folks reusable bits, which then drives up the complexity for us, for little real gain.