Tuesday, January 26, 2010

Backward Compatibility

In our day-to-day work we often use the term backward compatible. We use this term as if it were binary: something is backward compatible or it is not. And yes, this is true if a client works directly with a provider. If the provider can work with clients that were compiled against a previous version, then the provider can be said to be backward compatible with that previous version.

So is this always binary? Nope. The reason is the design by contract rule that we all follow, or should follow. In Java, we have this rule embodied in the interface; in C++ we used abstract classes. The primary advantage of design by contract is that the client depends on the contract and the implementation depends on the contract, but the client no longer depends on the implementation. Not only does this model allow multiple implementations for the same contract, it also makes the dependencies smaller, more concise, and, most important of all, explicit. This model is depicted in the next figure.

In the OSGi service specifications clients and implementers are bundles. The contract is defined in a package that is imported by both the client and the implementer. The implementation is (normally) registered as a service under the interface defined in the contract package.
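
The three-party arrangement can be sketched in plain Java. This is only an illustration under stated assumptions: the Greeter contract, the PoliteGreeter implementer and the map-based registry are all hypothetical names, and the map merely stands in for the OSGi service registry rather than using the real framework API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: every name in this file is invented for illustration.
final class ContractDemo {

    // The contract. In OSGi this interface would live in its own exported package.
    interface Greeter {
        String greet(String name);
    }

    // The implementer depends only on the contract.
    static final class PoliteGreeter implements Greeter {
        @Override
        public String greet(String name) {
            return "Good day, " + name;
        }
    }

    // Stand-in for a service registry: maps a contract type to an implementation.
    static final Map<Class<?>, Object> REGISTRY = new HashMap<>();

    // The client also depends only on the contract, never on PoliteGreeter.
    static String client(String name) {
        Greeter greeter = (Greeter) REGISTRY.get(Greeter.class);
        return greeter.greet(name);
    }

    public static void main(String[] args) {
        // The implementer registers itself under the contract interface.
        REGISTRY.put(Greeter.class, new PoliteGreeter());
        System.out.println(client("Peter"));
    }
}
```

Neither the client method nor PoliteGreeter mentions the other; both compile against Greeter alone, which is why compatibility has to be judged against the contract.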

Instead of having two parties, where backward compatibility was binary, we now have three parties, making the situation a tad more convoluted. Compatibility is now expressed against the contract package because client and implementer no longer have any dependency on each other. What kind of changes can we make that do not affect the client? Well, these are the same changes we could make to the implementer in the simple case. Adding members to classes and adding fields to interfaces is harmless for clients of these classes and interfaces; they never know the difference. Even semantic changes are ok as long as we specify the evolution rules in the new contract.

However, the situation is different for an implementer of a contract; an implementer is semantically much closer bound to the contract than the client. A client compiled against version 1 of the contract can be bound to a backward compatible implementer that is bound to version 2 of the contract. However, a client compiled against version 2 of a contract must never be bound to a version 1 implementation because such an implementer has no knowledge of the changes in the contract and can therefore not faithfully implement it.

Interestingly, some of these incompatibility semantics show up in the way Java works. Implementers usually implement a number of Java interfaces; if an implementer was compiled against an older interface and lacks a method that was added later, calling that method throws an Abstract Method Error, clearly a violation of the new contract. Note, however, that in this article implementing means implementing the contract. There are many OSGi specifications where the client is also required to implement interfaces for callbacks but they are still considered clients. For example, in the Event Admin specification the client must implement the Event Handler service. These interfaces are called client interfaces and any change in them is incompatible for a client.

Using the contract model, we must take this asymmetric situation between clients and implementers into account when discussing backward compatibility. Almost any change in the contract will require the implementer to be aware of this change. However, there are a few cases where you can change the contract without requiring the implementer's awareness. We had such an instance in the upcoming enterprise release. In the previous release, the Blueprint API had no generics; in this release the generic signatures are added. Generics are erased at runtime, therefore existing Blueprint implementations cannot detect the difference in the API and there are no additional responsibilities. Such a change is backward compatible for implementers.
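
The effect of erasure can be checked with a few lines of Java. This is a minimal sketch, not the Blueprint API: NamesV1 and NamesV2 are hypothetical interfaces standing in for two releases of one contract, the second differing only in its generic signature. The point is that the return type the JVM uses for linking is the same raw type in both.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collection;

// Hypothetical sketch: the two interfaces stand in for two releases of one contract.
final class ErasureDemo {

    // "Release 1" of a made-up contract, without generics.
    interface NamesV1 {
        @SuppressWarnings("rawtypes")
        Collection getNames();
    }

    // "Release 2" of the same made-up contract, with a generic signature added.
    interface NamesV2 {
        Collection<String> getNames();
    }

    // Returns the erased return type of getNames() as seen at runtime.
    static Class<?> runtimeReturnType(Class<?> contract) {
        try {
            Method m = contract.getMethod("getNames");
            return m.getReturnType();
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Both lines print the same raw type, java.util.Collection.
        System.out.println(runtimeReturnType(NamesV1.class));
        System.out.println(runtimeReturnType(NamesV2.class));
        // Differently parameterized lists share a single runtime class.
        System.out.println(new ArrayList<String>().getClass() == new ArrayList<Integer>().getClass());
    }
}
```

An implementer compiled against the raw signature therefore still satisfies the generified one, which is why this kind of change places no new burden on existing implementations.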

I hope it is clear that backward compatibility has two dimensions: clients and implementers. When we make a change to the contract we must ask ourselves if this change is compatible with clients and with implementers. Theoretically there are four cases; in practice, however, any change that is backward incompatible for clients is very likely to be incompatible for implementers as well, so there are only three cases left. The remaining question is how to handle these three cases in OSGi. Obviously, the version attribute is the most applicable place to start.

The only party that knows about the change is the person changing the contract. This person must somehow convey the backward compatibility rules to the client and to the implementer. Surprisingly (well, not really), these three cases map very well to the three parts of the OSGi version scheme:

major change - Incompatible for implementers and clients

minor change - Incompatible for implementers, compatible for clients

micro change - Compatible for implementers and clients
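
The mapping above can be written down as a small function. This is a sketch under assumptions, not part of any OSGi API: the ChangeImpact class and its methods are hypothetical, versions are taken to be plain major.minor.micro strings, and the second version is assumed to be the newer one.

```java
// Hypothetical sketch of the mapping above; not part of any OSGi API.
final class ChangeImpact {

    // Parses "major.minor.micro" into three ints; missing parts default to 0.
    static int[] parse(String version) {
        String[] parts = version.split("\\.");
        int[] result = new int[3];
        for (int i = 0; i < 3 && i < parts.length; i++) {
            result[i] = Integer.parseInt(parts[i]);
        }
        return result;
    }

    // Describes who is affected when a contract package moves from one version to a newer one.
    static String describe(String from, String to) {
        int[] a = parse(from);
        int[] b = parse(to);
        if (a[0] != b[0]) {
            return "incompatible for implementers and clients";
        }
        if (a[1] != b[1]) {
            return "incompatible for implementers, compatible for clients";
        }
        return "compatible for implementers and clients";
    }

    public static void main(String[] args) {
        System.out.println(describe("1.4.0", "2.0.0")); // major change
        System.out.println(describe("1.4.0", "1.5.0")); // minor change
        System.out.println(describe("1.5.0", "1.5.1")); // micro change
    }
}
```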

Using OSGi version ranges, implementers can import all versions where the major and minor part is fixed and ignore micro changes. For example, when the package that is compiled against has version 2.3.6, then the implementer should import [2.3,2.4). Clients can import all versions where the major part is fixed. For example: [2.3,3). I call this model of importing different ranges based on the version that is compiled against the version policy. There is an implementation policy and a client policy.
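
The two policies can be expressed as a pair of helper functions. The following is a minimal sketch, not what bnd actually does: the VersionPolicy class and its method names are hypothetical, and the input is assumed to be a plain major.minor.micro string.

```java
// Hypothetical sketch of the two import policies; class and method names are invented.
final class VersionPolicy {

    // Parses "major.minor.micro" into three ints; missing parts default to 0.
    static int[] parse(String version) {
        String[] parts = version.split("\\.");
        int[] result = new int[3];
        for (int i = 0; i < 3 && i < parts.length; i++) {
            result[i] = Integer.parseInt(parts[i]);
        }
        return result;
    }

    // Implementer policy: major and minor are fixed, micro changes are ignored.
    static String implementerRange(String compiledAgainst) {
        int[] v = parse(compiledAgainst);
        return "[" + v[0] + "." + v[1] + "," + v[0] + "." + (v[1] + 1) + ")";
    }

    // Client policy: only the major part is fixed.
    static String clientRange(String compiledAgainst) {
        int[] v = parse(compiledAgainst);
        return "[" + v[0] + "." + v[1] + "," + (v[0] + 1) + ")";
    }

    public static void main(String[] args) {
        System.out.println(implementerRange("2.3.6")); // prints [2.3,2.4)
        System.out.println(clientRange("2.3.6"));      // prints [2.3,3)
    }
}
```

In a manifest the resulting range would end up in the version attribute of the import, for example Import-Package: com.example.contract;version="[2.3,2.4)" for an implementer, where com.example.contract is a made-up package name.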

This model works very well but it has one huge disadvantage: it requires that exporters follow the OSGi version semantics and not just the syntax. Unfortunately, we punted on the semantics when we had to specify the version attribute. We did recommend a good strategy but we did not mandate it, nor was it complete. In practice, this means that people are not carefully versioning their packages (if at all!). It is always tempting to put the specification version on the package because this makes it clear which version of the specification you're getting when you have to select a package. However, this is the so-called marketing version. Netscape Navigator came out as version 4.0 because it had to compete with Internet Explorer 3.0; there never was a version 3.0. In OSGi, we are currently at release 4.2 but if you look at the framework package version you'll find we're at 1.5.1, telling you it had 5 client backward compatible changes and since then one implementation backward compatible change. In contrast, Wireadmin is still at 1.0. There are valid reasons for marketing versions but they unfortunately do not encode the evolution path the package has taken. It means that clients and implementers can no longer use a version policy to specify their compatibility requirements and must treat the version as an opaque identifier. The dire consequence of this model is that you basically have to rebuild all dependencies for any tiny change because clients and implementers can no longer reason about backward compatibility.

One solution that I proposed many years ago is to export a package under multiple versions. The exporter knows much more about its compatibility with prior versions; being able to specify the compatibility saves the importer from having to make assumptions. However, exporting a package under multiple versions only supports two cases for backward compatibility: if a version is listed, it is backward compatible; if not, it is not compatible. As I hope this blog has demonstrated, treating backward compatibility as black and white is not sufficient.

I therefore hope it is clear that the exporter must provide different bits of information for the implementers and the clients. This could be a new version-like attribute, or the exporter could export three independent numbers:

An implementation compatibility number

A client compatibility number

A revision number
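
One possible reading of these three numbers is sketched below. Nothing like this exists in OSGi; the CompatNumbers class and the matching rules are assumptions made purely for illustration, namely that each compatibility number changes whenever that kind of compatibility breaks and that an importer may bind whenever the number it cares about is unchanged. The sketch deliberately ignores whether the export is older than the one compiled against.

```java
// Hypothetical sketch of the three-number idea; no such attribute exists in OSGi.
final class CompatNumbers {
    final int implementation; // assumed to change when implementers must adapt
    final int client;         // assumed to change when clients must adapt
    final int revision;       // assumed to change on every other modification

    CompatNumbers(int implementation, int client, int revision) {
        this.implementation = implementation;
        this.client = client;
        this.revision = revision;
    }

    // A client may bind if client compatibility is unchanged since it was compiled.
    static boolean clientCanUse(CompatNumbers compiledAgainst, CompatNumbers exported) {
        return compiledAgainst.client == exported.client;
    }

    // An implementer may bind if implementation compatibility is unchanged.
    static boolean implementerCanUse(CompatNumbers compiledAgainst, CompatNumbers exported) {
        return compiledAgainst.implementation == exported.implementation;
    }

    public static void main(String[] args) {
        CompatNumbers compiled = new CompatNumbers(3, 1, 7);
        CompatNumbers exported = new CompatNumbers(4, 1, 8); // broke implementers only
        System.out.println(clientCanUse(compiled, exported));      // prints true
        System.out.println(implementerCanUse(compiled, exported)); // prints false
    }
}
```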

The author of the contract package would maintain these numbers and increment them when the corresponding compatibility broke. This model seems to combine the best of both worlds. It exposes the different compatibilities without any required knowledge on the importer's side. However, my personal position is that the current version policy works today if people are willing to follow the rules. Anything else will require spec changes. OSGi has been versioning its packages accurately since 1998. The thousand dollar question is, will others follow these semantics?

P.S. In the past year I've done some experimenting with automatically generating import ranges based on the exported version in bnd and an implementation and client version policy. You can read about these experiments here.

7 comments:

Great post Peter. I think us OSGi and modularity preachers haven't done enough to spread the message of the importance of proper versioning. At Eclipse, we have done well building a platform for having a strict set of versioning guidelines that projects are meant to follow. We also built tools to help facilitate API and version evolution via PDE API Tools.

At Eclipse, things are a bit easier since we follow the same set of guidelines and semantics. I admire your goal of trying to create flexible versioning policies for people to choose via BND.

I don't know what it is, but it seems in the software industry, we have taken such a "laissez faire" attitude towards versions. I can count the people I've talked to over the years on one hand that cared about versioning things properly.

Thanks Peter for a great post! I am glad that this topic is gaining more and more attention. Hopefully people really start thinking about what they are doing and my frustration will come to an end ;-)

As a side note, I like your bnd experiments. The only thing I see that is still missing (which I don't know how to easily solve either) is picking up the smallest possible compatible version. BND of course takes the one found in the path, but there might be others as well. Always using the latest and greatest in your build is a serious problem when trying to find the lowest possible working version (of course you want to benefit from new versions if possible). We had that issue for a while and for that reason decided to drop import version ranges altogether, because they suggested a misleading sophistication and prevented use in older applications (for instance servlet version 2.5 when none of its features were actually used). Hard to come by I guess.

@Chris Yes, Eclipse already helps people a lot. Unfortunately it also distracts people when applying versions in PDE. For instance, when adding an Import-Package, the Manifest Editor just pulls the exported version (including all version parameters). This leads (if used naively) to unnecessary implementation dependencies. There is also no notion of an implementer or client, but hey, it is still one of the best tools we have so far, so don't get me wrong. Just saying there is still a long way ahead of us.

@Mirko: bnd has lots of features that are unfortunately not documented (due to lack of any financial incentive to spend time on this) but that are actually successfully used in the OSGi build. This is a non-trivial build with 1300 bundles from about 130 projects.

In this build, bnd has a -buildpath, which is a list of bundle symbolic names with version ranges, just like a package import. Compilations (and the build process) are run against the LOWEST version granted by the bnd file. This automatically makes bnd have an affinity for a low base version. However, testing (which bnd supports with JUnit and the launching API, both in Eclipse JUnit and ant) uses the latest permitted versions. This model seems to work very well and increases the overall robustness because you prevent the version brittleness problem but you test against the most up to date libraries.

@Peter: Wow, this really sounds like a promising feature! Unfortunately Maven doesn't support something like that just yet. Maybe version 3 can pick up the idea and bind the lowest version in a dependency to the compile phase and the latest version to the test phase (only with some switch of course, since not everyone might trust this). Having something like that provided in the standard tool chain would help spread adoption of correct versioning a lot, I think! Let's hope the big players like Maven and Eclipse will adopt something like that soon! Thanks again for providing this valuable insight.

@Chris Aniszczyk (zx):"I can count the people I've talked to over the years on one hand that cared about versioning things properly." The problem with this is "properly" is subjective.

We (You, Peter, Boris etc) know that versioning semantics must be tied to compatibility, but most of the world doesn't realize this. Many people probably assume they are versioning things "properly" and in their mind they are. (Whatever "properly" means in their organization)

We need more education around why versioning and compatibility must go hand-in-hand. Articles like this are great! Next week I'm giving a lecture on this topic at the local University -- Peter, I might borrow some of your material ;).

Thanks for a good post Peter, many points agreed - I have been using the distinction between "marketing" and "technical" versions for a long time to describe the problem. The "affinity to low base version" in bnd really looks interesting, although I can see the limitations due to selecting from only the enumerated set of bundles.

@Ian Bull:Educating about the role of proper versioning may help but I guess it will always do so only partially as long as developers are people (which I hope *will* be long :) Without being cheeky, we have found improperly versioned bundles in the core of one well known OSGi implementation...

What I believe would help substantially is to take the burden of correct versioning off the shoulders of the developers. We have a preliminary OSGi version generator which works on standalone bundles (search for "osgi+version" on dss.kiv.zcu.cz). Would something like this make people willing to follow the rules?