Writing Backwards Compatible Software, Part 1

At HomeAdvisor, as our ecosystem of Java clients, mobile apps, and microservices continues to grow, we’ve been thinking a lot more about backwards compatible software. Typically, we think of software compatibility in two forms: intraprocess (source code, compiled libraries, etc) and interprocess (APIs, messaging, etc). With more than a dozen agile teams all writing code and services that have to work together, keeping compatibility in mind is important for every change we make.

In this post, the first of a three part series on writing backwards compatible software, we’ll look at intraprocess software compatibility. We’ll look at the different ways software libraries can introduce breaking changes, from simple source code level changes to more difficult to track logic errors. We’ll also look at some of the best practices we have adopted to help us prevent writing software that breaks other teams.

Backwards Compatible Software in Source Code

Source code (or compiled code you use from a third party) can introduce breaking changes in a number of ways. Luckily, these usually surface quickly (at compile time), have the smallest impact because they are caught during development, and tend to be quick to fix. At HomeAdvisor, most of our agile teams provide microservices with REST APIs in front of their core business logic, so upgrading to new compiled code is usually not a problem. However, we still have a couple of places where we share source code:

Our large monolithic web applications are not maintained by a single team and depend on many core libraries with many committers. These frequently fail to compile due to classes being moved, renamed, or altered in some other breaking manner.

Helper libraries used to build clients and microservices. These libraries are low-level frameworks that every team uses for creating new functionality, accessing infrastructure services, and more.

In these cases, we have to take care to make changes in a non-breaking way. To that end, here are a few of the ways in which we write backwards compatible software in our Java libraries.

Overload Methods

It should be obvious, but you should never change the signature of a public method. The same goes for default and protected access, since you can't always be sure what code is extending yours. Instead, find a way to overload a method if you need to change its signature. Let's say you have some method that takes a single parameter:

public void foo(String s1)
{
    // Business logic here
}

What if this method now requires a second parameter? Create a new method with two parameters, and move the business logic into the new method. The original method simply delegates to the new method, providing some sensible default:

public void foo(String s1)
{
    foo(s1, DEFAULT_VALUE_FOR_S2);
}

public void foo(String s1, String s2)
{
    // Business logic here
}

This approach lets users call either form of the foo method, and it extends well even if you need to add more than one parameter. Generally speaking, business logic should always live in the method with the most parameters, with all other overloaded methods delegating to it. This is a pattern we use quite a bit with our Robusto API client Spring extension for building remote API calls.

You can use the same approach for dealing with changes to return types, which are not part of the method signature in many languages. In this case you would simply create a new method that delegates to the existing method, and then does some translation before returning.

public Foo foo(int id)
{
    Foo foo = FooDAO.getById(id);
    return foo;
}

public Bar bar(int id)
{
    Foo foo = foo(id);

    // Translate foo to bar
    Bar bar = new Bar();
    bar.setId(foo.getId());
    bar.setSomething(foo.getSomething());
    bar.setSomethingElse(foo.getSomethingElse());
    bar.setNewField(calculateOrLookupNewField());
    return bar;
}

Note that if you're changing a void return type to non-void, you're likely OK: callers don't expect any return value, so adding one doesn't change their usage.

Don’t Delete, Deprecate

If you absolutely must remove a non-private method or class, consider deprecating it first. This can be done using the @Deprecated annotation, which can trigger compile time warnings that alert users they are using functionality that will soon disappear. This allows users of the library to phase out their usages of the deprecated code instead of breaking them immediately. Once you’re sure nobody is using the method or class, you can safely delete it in the next major revision of your library. It’s also a good idea to use JavaDoc to indicate your deprecation plans and alternative methods developers can use instead.
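For example, using the foo method from earlier, deprecation might look like this (the version numbers are illustrative):

/**
 * @deprecated As of 2.0, use {@link #foo(String, String)} instead.
 *             This method will be removed in the next major release.
 */
@Deprecated
public void foo(String s1)
{
    foo(s1, DEFAULT_VALUE_FOR_S2);
}

The @deprecated JavaDoc tag points developers to the replacement, while the @Deprecated annotation lets the compiler flag any remaining usages.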

Avoid Non-Private Fields

Any good object oriented design should avoid exposing internal state directly, using getter and setter methods instead. If a field is non-private, any code that accesses it directly will break if you ever rename it, change its scope, or remove it. Getter and setter methods shield users of the library from these implementation details.
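As a quick sketch (the class here is hypothetical), accessors let the internal representation change without breaking callers:

public class Customer
{
    // Private field: free to be renamed, retyped, or removed later
    private String address;

    public String getAddress()
    {
        return address;
    }

    public void setAddress(String address)
    {
        this.address = address;
    }
}

If address later becomes a structured object, getAddress() can translate on the fly and existing callers never notice.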

Default Methods

Java 8 introduced a new feature where interfaces, which previously could not contain implementation, can now provide default methods. While this has somewhat blurred the lines between interfaces and abstract classes, one area it really helps with is backwards compatibility. Let’s say we have the following interface:

public interface AddressService
{
    AddressDTO getAddress(long customerId);
}

Now let’s say you need to add a new method that provides a batch operation. Before Java 8, you would have added a new method declaration:

public interface AddressService
{
    AddressDTO getAddress(long customerId);

    // Long live batch processing!
    Collection<AddressDTO> getAddress(Collection<Long> customerIds);
}

Now any class that implements the AddressService, for example a database DAO, would have to implement the new batch method or it would fail to compile. Thanks to default methods, however, we can do this for them:

public interface AddressService
{
    AddressDTO getAddress(long customerId);

    // Not really a batch operation, but it's a sensible default implementation
    default Collection<AddressDTO> getAddress(Collection<Long> customerIds)
    {
        Collection<AddressDTO> addresses = new ArrayList<>();
        for (Long customerId : customerIds)
        {
            addresses.add(getAddress(customerId));
        }
        return addresses;
    }
}

Now any implementation of this interface continues to work even after upgrading. Of course, you can argue the usefulness of a batch operation that simply loops. But the important thing is that we’ve allowed users to upgrade at their own pace instead of breaking them. We’ve provided a sensible default, while still allowing specific implementations to override it as they see fit.

In fact, this is precisely how the Java team was able to upgrade the core Collections framework in Java 8 to include streams and other new features without requiring users of those classes to rewrite their code. If not for default methods, any class that implemented the Collections interfaces would have been required to implement the new methods to be compatible with Java 8.
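You can see this in the JDK itself. Iterable.forEach and Collection.stream are default methods, so any List implementation written before Java 8 inherits them automatically:

List<String> names = Arrays.asList("Alice", "Bob");

// Both calls resolve to default methods added to existing interfaces in Java 8
names.forEach(System.out::println);                                   // Iterable.forEach
long count = names.stream().filter(n -> n.startsWith("A")).count();  // Collection.stream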

Exceptions

One area that can break in a less obvious way is exceptions. Without getting into the debate over checked versus unchecked exceptions, it's important to remember that your public methods are contracts with other developers. If you decide to use checked exceptions, they become part of that contract, and adding to, or removing from, the types of exceptions your method throws will break other developers. In these cases you must provide a migration path, overloading the method if possible. That is somewhat awkward here because exceptions are not part of the formal Java method signature: you would need to change the method name or parameter list, which feels a bit silly in this case.

If you choose not to use checked exceptions, you still have a responsibility to convey how your public methods throw exceptions, usually in the form of JavaDoc. Regardless of the mechanism, it is not safe to begin throwing new exceptions without warning users. Changes such as these cannot be detected by a compiler, and users likely won't know until the first time the exception appears unexpectedly in a stack trace. Therefore it's important to publicize the use of exceptions in your code, preferably with JavaDoc or other public documentation. Also note that no longer throwing an exception is not as dangerous as starting to throw a new one, since the upstream code will, at worst, have a defunct catch block.
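The @throws JavaDoc tag works for unchecked exceptions too, and is a lightweight way to make them part of your documented contract. A small sketch (the method, DAO, and exception names are illustrative):

/**
 * Returns the current address for the given customer.
 *
 * @throws CustomerNotFoundException (unchecked) if no customer exists with the given ID
 */
public AddressDTO getAddress(long customerId)
{
    return addressDao.findAddress(customerId);
}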

Dependency Management

We use Apache Maven for all of our Java dependency management. While its shortcomings are well documented, it works well enough for us, and there are some nice features we utilize to prevent breaking changes. For example, every library has a group and artifact ID, and Maven defines uniqueness using these two values. This means that only one version of a given library can exist in an application at a time. But if you introduce a breaking change and have a reason for 1.0 and 2.0 to co-exist, you can change the group and/or artifact ID so that Maven treats the two versions as distinct libraries and includes them both. Note that you'll also need to change the Java package naming structure in each library to avoid Java-level conflicts.

This is a technique we are currently using to help facilitate our Elasticsearch upgrade. We're a little late to the party, but we're currently upgrading from 1.7 to 2.4. Instead of doing a big bang upgrade where every application and microservice has to migrate at once, we've forked our Elasticsearch client library. The newer version uses a different Maven group ID, along with renamed Java packages. This allows any application or microservice to use both versions of the Elasticsearch library, which means we can migrate to the new cluster in a more controlled fashion. Using runtime toggles, each application can switch over to the new library (and upgraded cluster) on its own schedule.

Apache also uses this technique for its commons libraries. Each major release includes the version number in both the artifact and Java package names. This lets, for example, versions 2 and 3 co-exist in the same application. As long as the maintainers are careful not to break compatibility within the same major version, applications can happily include multiple versions.
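Commons Lang is a concrete example: version 2.x lives in the org.apache.commons.lang package, while 3.x was renamed to org.apache.commons.lang3, so fully qualified references to both can coexist in the same class:

// Same utility, two major versions, no conflict thanks to the renamed package
boolean blankV2 = org.apache.commons.lang.StringUtils.isBlank(input);
boolean blankV3 = org.apache.commons.lang3.StringUtils.isBlank(input);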

Another useful Maven feature is the dependencyManagement tag in POM files. This lets you specify an exact version to use for a given group and artifact, overriding any versions that might be pulled in transitively. For example, suppose your application depends on two libraries that themselves share a common dependency. The dependency tree would look as follows:

+- A:1.0
|  \- C:1.0
+- B:2.3
   \- C:1.1

Maven normally uses a nearest-wins heuristic to decide which version of the common library to bring in, but in this case the common library (C) is the same distance away in both branches. The most likely outcome is that version 1.0 will be included, since its parent (A) appears first in the dependency tree. But what if version 1.1 has a new class that B actually uses? Adding new classes is always backwards compatible, but in this case you'll encounter NoClassDefFoundError exceptions because Maven has to pick a single version of library C to include. Here it picks the lower version, which doesn't meet the compile time requirements of B.

The solution is to include a dependencyManagement section in your application POM to explicitly declare which version of C you want included:

<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>com.homeadvisor</groupId>
      <artifactId>C</artifactId>
      <version>1.1</version>
    </dependency>
  </dependencies>
</dependencyManagement>

Now, even if new libraries that depend on different versions of C are added to the POM, you will always get the same result when Maven picks which version to include. This is a technique we use a lot to enforce consistency with our API client library, as well as with common dependencies in our microservice stack.

Spring Beans

We make heavy use of the Spring project in both our monolithic applications and microservices. Missing beans tend to manifest at runtime instead of compile time, but there are still some best practices you can incorporate to minimize bean creation exceptions. At HomeAdvisor, we've adopted some guidelines for creating new beans.

First, if a bean is defined by an interface with multiple implementations, for example a DAO backed by both a database and a cache implementation, assign each implementation a unique name. Additionally, we advise our teams to use the @Primary annotation on whichever implementation is preferred most of the time; this is the one that will be autowired by default. Otherwise, use the @Qualifier annotation when autowiring to explicitly declare which implementation you want for your particular usage, as shown in the sketch below.
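Here is a minimal sketch of that arrangement (the class and bean names are illustrative):

public interface CustomerDao
{
    Customer getById(long id);
}

// Preferred implementation: autowired whenever no qualifier is given
@Primary
@Repository("dbCustomerDao")
class DbCustomerDao implements CustomerDao
{
    public Customer getById(long id) { /* query the database */ return null; }
}

@Repository("cachedCustomerDao")
class CachedCustomerDao implements CustomerDao
{
    public Customer getById(long id) { /* consult the cache first */ return null; }
}

Consumers then either accept the default or name the implementation they want:

@Autowired
private CustomerDao customerDao; // receives DbCustomerDao via @Primary

@Autowired
@Qualifier("cachedCustomerDao")
private CustomerDao cachedCustomerDao; // explicitly picks the cache implementation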

Another best practice is to configure component scanning with a package whitelist instead of a blacklist. For example, most of our monolithic web applications simply component scan the base package com.homeadvisor and then exclude a few specific packages below it. This causes issues (usually in integration testing after all stories are merged) because an application will find a new bean definition but likely won't have the necessary libraries included at runtime to instantiate it. This can lead to some pretty verbose and intimidating stack traces, all because you're trying to create a bean that your application doesn't need anyway. The better approach is to explicitly declare only the packages you absolutely have to scan for the application to function. In fact, you may be surprised how much quicker your applications start up when you eliminate unused beans.
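In annotation terms, the difference looks something like this (the package names are illustrative):

// Blacklist style: scans everything under com.homeadvisor, then carves out exceptions.
// Any new bean added anywhere in the tree gets picked up, needed or not.
@ComponentScan(
    basePackages = "com.homeadvisor",
    excludeFilters = @ComponentScan.Filter(
        type = FilterType.REGEX, pattern = "com\\.homeadvisor\\.legacy\\..*"))
public class BlacklistConfig { }

// Whitelist style: only the packages this application actually needs
@ComponentScan(basePackages = {
    "com.homeadvisor.billing",
    "com.homeadvisor.customer"
})
public class WhitelistConfig { }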

Configuration

Another less obvious way new code can introduce breaking changes is application configuration. This is another area where the compiler won't help you, so bugs likely won't be found until runtime. At HomeAdvisor we see this kind of incompatibility in a couple of forms:

When a library requires new configuration, applications that upgrade to the new version may not know to include it. This can lead to a wide range of problems, some benign and some potentially fatal: it may simply be a feature that quietly misbehaves, or the application may fail to start at all.

Changing the name of existing configuration. If applications don't adjust their existing configuration, they fall into the same trap as if they hadn't provided any configuration at all.

To that end, take care when introducing new configuration or changing existing conventions. If new configuration is required, try your best to provide a reasonable default. At HomeAdvisor we use a series of INI files for most of our configuration (with the ability to add JVM args to specific applications as a final override in any environment). Each library and application can have its own INI file, so we tend to put defaults there when it makes sense. For example, configuration for health check thresholds would have a default value that makes sense for most environments. This way any application that needs the health checks at least gets a reasonable default, and we can later tune the thresholds in each environment using Puppet controlled files. For configuration related to connection strings or credentials, which are likely to be different in each environment, we make sure our DevOps team knows to generate the necessary configuration as part of the deployment process. This is usually done via Jira sub-tasks assigned to the DevOps engineer in charge of that particular release.

If you need to change the naming convention of your configuration, try supporting both the old and new ways concurrently so that users of your library can update at their own pace instead of breaking right away. This is an approach we took with our API client framework recently. Originally, it only offered global configuration that affected every API call the client made. But we soon learned that things like connection and read timeouts are difficult to apply globally, so we added the ability to configure each API call individually. In doing so, we still respect the old configuration values while giving the new per-call values precedence over them. Eventually, in a future major release, we'll stop supporting the older configuration values.
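A minimal sketch of that precedence logic, assuming a simple key-value configuration abstraction (the interface, key names, and default below are illustrative, not our actual framework):

// Hypothetical configuration abstraction: returns the value for a key,
// or the supplied default if the key is absent
interface Configuration
{
    int getInt(String key, int defaultValue);
}

static final int DEFAULT_READ_TIMEOUT_MS = 5000;

static int getReadTimeout(Configuration config, String callName)
{
    // Old global key, still honored so existing configuration keeps working
    int globalTimeout = config.getInt("client.readTimeoutMs", DEFAULT_READ_TIMEOUT_MS);

    // New per-call key takes precedence whenever it is present
    return config.getInt("client.calls." + callName + ".readTimeoutMs", globalTimeout);
}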

Conclusion

Breaking changes in software are inevitable. Business requirements, technology stacks, development teams, and more all change over time. What works today may not work in a year. By taking some simple actions with your current code, you can hopefully save some pain in future upgrades. The key is to be flexible in what you support, apply common sense best practices for the language you are working in, and most importantly give users of your code ample time to adjust to breaking changes.

In our next post, we’ll look at writing backwards compatible software when it comes to things like APIs, distributed messaging, and other interprocess forms of communication.
