Cloud Foundry, Forking and the Future of Permissively Licensed Open Source Platforms

A week ago today a minor skirmish broke out on Twitter between Apprenda – purveyor of PaaS software – and advocates of the open source Cloud Foundry project, originally created by VMware. The major point of contention concerned forks; specifically, forks of the Cloud Foundry project. In a blog post, Apprenda CEO Sinclair Schuller made two broad assertions: first, that forks were a problem for open source generally, and second, that they represented a problem for Cloud Foundry specifically. Unsurprisingly, Cloud Foundry defenders objected – vociferously – claiming FUD. Leaving aside the vendor sports angle to this issue, because it is frankly uninteresting, a deconstruction of the original claims may be useful for examining a larger question about licensing outcomes.

The first assertion is the easiest to address: we reject the notion that forking is an undesirable outcome. Forking is, on the contrary, provably beneficial to modern open source projects – at least from a developmental perspective. While it is true, as Schuller wrote, that longtime open source advocate Eric Raymond argued that forking is a negative for open source projects (see “Homesteading the Noosphere”), context is important. At the time that essay was written, the cost of forking was extraordinarily high. The advent of decentralized version control systems such as Git, however, reduced the friction associated with parallel development to negligible levels in most cases. As Brian Aker – MySQL author and creator of the Drizzle fork – once put it,

I fully believe that the forking we saw was enabled by the move to [decentralized] bzr/launchpad. Without that move it would have been a lot harder to make that shift for most of the forks and distributions.

With the cost of forking reduced or eliminated entirely, software development is parallelized; much as bacteria evolve more quickly because they iterate in peer-to-peer fashion, so too can software projects innovate along multiple parallel tracks rather than a single serial development path. DVCS-enabled forking, then, is an enormous step forward for software development.

What is less clear, however, is the impact of forking on platform compatibility in an age of permissively licensed software. In his counterpoint to Schuller’s original blog post, VMware’s Patrick Chanezon pointed to this timeline of the various Linux forks, saying in part that there would be “No Linux of the Cloud without forking.” This assertion is likely correct; certainly it’s difficult to imagine Linux evolving as quickly or successfully without its decentralized – and fork-friendly – development model. As many are aware, in fact, Git – the most popular DVCS tool in use today – was originally written to manage the Linux kernel.

There is one important difference between Linux and projects like Cloud Foundry, however – the license. Linux is governed by the GPL, a protective reciprocal license that demands that modifications and changes to a given project be made available under precisely the same terms, assuming that these updates are distributed rather than maintained internally. Nor are these terms temporally limited: the GPL’s restrictions follow an asset in perpetuity. The practical import of the GPL is that public variants are throttled. The license does not require all shipped distributions of a codebase to be identical, but it does prevent one party from introducing proprietary features to differentiate it from another. Red Hat, for example, cannot ship a kernel that contains code unavailable to commercial competitors such as Amazon, Canonical, CentOS or SUSE.

Permissive licenses, however, contain no such restrictions. They are by design the antithesis of copyleft licenses such as the GPL, prohibiting very little. Permissively licensed code, for example, can typically be consumed by a third party, modified and then shipped as a proprietary product with no licensing issues whatsoever. For advocates of permissive licenses, software freedom means letting users decide for themselves how best to use the asset. For developers, permissive licensing is ideal, because of the lack of restrictions imposed. Perhaps for this reason, perhaps as a result of an inevitable recalibration of licensing distribution, recent years have seen a sustained and significant shift towards adoption of permissive licenses – largely at the expense of the GPL.

It is no surprise, then, that recent projects like Cloud Foundry (2011) and OpenStack (2010) are permissively licensed. Nor that this choice is welcomed by developers and corporate sponsors alike; it imposes less overhead on both. What is curious, however, is that discussion of the longer term implications of this shift with respect to forking is relatively rare.

It’s not that we lack history with permissively licensed projects, to be clear. While many of the higher profile open source projects like Linux or MySQL were reciprocally licensed, the Apache Software Foundation, an organization which considers permissive licenses part of its mission, was founded in 1999. Many popular programming language runtimes – PHP, Python, and Ruby, to name a few – are likewise permissively licensed. These projects and others have proved over a period of decades that permissively licensed assets can grow and thrive just as well as reciprocally licensed alternatives.

Permissively licensed platform technologies, however, are comparatively rare. The distinction is important, because platform technologies offer a different set of incentives than do component projects. Consider the case of OpenStack. As a permissively licensed asset, any consumer of the technology is legally able to produce and distribute a unique, differentiated implementation. And while the project culture and many practical incentives discourage this kind of forking, members of that community remain concerned about fragmentation – possibly because the incentives for vendors to differentiate from one another in the all-important cloud market are great, or possibly because we’ve seen it before in projects like Android and its numerous vendor-specific variants. And if it’s possible to envision a world with different, competing variants of OpenStack, it’s not unreasonable to ask the same question of projects like Cloud Foundry.

It’s worth mentioning, however, that from a customer perspective, forks or variants are not universally bad. While the various Android versions may represent unfortunate design decisions on the part of the vendors responsible for them, applications are in the overwhelming majority of cases compatible from device to device, assuming version equivalency.

Compatibility, ultimately, is the key to determining whether the forks which are so beneficial to development are a problem for customers. Java, for example, had multiple distinct implementations, which ensured competition and thus continued innovation to benefit customers. Compatibility, meanwhile, was tested regularly by a set of tests known as the TCK, or Technology Compatibility Kit. Without a passing grade, in fact, a given implementation could not use the name Java, and thus would not be acceptable to customers. This seems to be similar to the path Cloud Foundry, for one, is pursuing with its Cloud Foundry Core compatibility test.
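To make the distinction concrete, here is a minimal sketch of how a compatibility gate of this kind behaves. It is not the actual TCK or the Cloud Foundry Core test; the capability names and function names are hypothetical, chosen only for illustration. The point it shows is that a fork can add features and still pass, while an application that relies on those additions is portable only to variants that share them.

```python
# Hypothetical capability names; not taken from the Java TCK or Cloud Foundry Core.
REQUIRED_CORE = frozenset({"push_app", "bind_service", "scale_instances"})


def missing_capabilities(implementation, required=REQUIRED_CORE):
    """Return the required capabilities that an implementation lacks."""
    return set(required) - set(implementation)


def is_compatible(implementation, required=REQUIRED_CORE):
    """An implementation passes only if it offers every required capability."""
    return not missing_capabilities(implementation, required)


def is_portable(app_uses, target_implementation):
    """An application can move to a target only if the target offers
    everything the application actually uses, core or not."""
    return set(app_uses) <= set(target_implementation)


# Two hypothetical forks: one adds a proprietary extra, one drops a core feature.
vendor_a = set(REQUIRED_CORE) | {"custom_router"}
vendor_b = {"push_app", "bind_service"}
upstream = set(REQUIRED_CORE)

# An application that stays within the core, and one that uses the extra.
core_only_app = {"push_app", "bind_service"}
extended_app = {"push_app", "custom_router"}
```

Under these assumptions, `vendor_a` passes the gate despite its extra feature and `vendor_b` does not. The `core_only_app` can move between `vendor_a` and `upstream` freely, whereas `extended_app` runs only on `vendor_a`, even though both implementations are nominally compatible.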

While forking is then an unquestionable benefit to developers, its implications for customers – at least for projects that permit competing, non-shared implementations – are as yet unclear. Much of the success or failure of permissively licensed open source platform projects, then, will depend on how well they do or do not address compatibility questions for customers.

12 comments

It might be worth mentioning the ongoing compatibility program “Cloud Foundry Core” in place to ensure application portability/compatibility across hosted providers of Cloud Foundry. Don’t forget that ‘time’ based incompatibilities can also exist in any OSS project with newer versions vs. old also raising a classic compatibility problem. Cloud Foundry Core tries to help users with both:

I’d point out that in terms of the original blog entry, Sinclair’s laughable highlighting of the “fork me on Github” banner *of the Cloud Foundry DOCUMENTATION* was the lightning rod, demonstrating a fundamental lack of understanding of how forking works today, specifically w.r.t. Github and contributions. That’s why the “minor spat” broke out on Twitter, fundamentally. His updated post today – which tumbleweeds roll past as oxygen drains from the desert – attempts to rewrite history, rather than admitting that he used a terrible analogy in the first place. Oh, and I’d mention that it wasn’t just me that called Apprenda out, but more than one of our Cloud Foundry partners joined me (obvious disclaimer here – I’m developer advocate for the CF platform, so take my views with any pinches of sugar or salt you prefer).

I’m not at a level in the project where I can make policy pronouncements, but I do have a very strong engagement in the community. I’ll make a few remarks.

1. Cloud Foundry Core is an attempt to ensure that we do, as a community of Cloud Foundry users and vendors, stay together from an API compatibility perspective.

2. The vendors and partners I interact with daily are big fans of what we’re building and keen to understand how to retain that compatibility, no matter the FUD from our friends from rival platforms. I have great relationships with those vendors, partners, and implementers, and I’m happy to help them stay in lockstep with progress as appropriate.

3. The past month or two has seen a change in approach from the CF engineering team, and that will be evident to anyone following the relevant (public) mailing lists via Google Groups, the source code on Github, etc. It’s not just our outward approach – I’m personally feeling more connected to the engineering and product groups myself, than ever! We tried a Gerrit approach over the past ~10 months, and we’re moving to something more open right now. It’s adaptation, and I personally believe it is for the benefit of both the CF project, and the surrounding community.

4. I completely get the issues raised around OSS licensing, permissiveness etc. I loved Luis Villa’s post on this a month or so back http://tieguy.org/blog/2013/01/27/taking-post-open-source-seriously-as-a-statement-about-copyright-law/ – you know me – I’ve been a decade in IBM, and longer than that as part of OSS communities. I’ve worked with big businesses. I get it. Times change, and we as a (closed)|(open)|(inclusive)|(software) community need to adapt.

I don’t think this post really refutes Sinclair’s main point, which, as Andy points out in overly disparaging terms, he clarified today to mean forking into what are essentially multiple distros. And in that context, he’s right.

Why is forking bad for the enterprise?

If a company wants to build its own PaaS to meet its exact requirements and wants to own its cloud application infrastructure, forking off an existing base is a great way to benefit from hard work and lessons learned by others; code reuse writ large. I totally agree with Stephen that the combination of distributed version control and permissive licenses makes that a very viable option for developers and companies alike. But most enterprises aren’t looking to own their infrastructure at that level.

In the case of PaaS, they’re looking for an adoptable (and adaptable) platform where they can host their applications and receive as much organizational value as possible. The true requirements rarely (if ever) dictate whether the platform is open-source code or proprietary code (and when they do, it’s a political statement more than anything else). From a cost of operations and maintenance perspective, most companies will look to purchase a platform to run their cloud applications. That’s not a surprise to anyone, I hope.

For this reason (and others, like ignoring embedded flavors), the number of relevant Linux distributions is fewer than 10 rather than hundreds, which limits the strength of the forking argument in this context.

While it’s definitely true that any Cloud Foundry offering that is Core Compatible will support the same basic set of capabilities – and any application that limits itself to those capabilities should be able to be migrated – history has shown us that developers tend not to constrain themselves to the core capabilities when faced with a challenge.

I’m sure many of us have had to try to migrate JEE applications from WebSphere to WebLogic or vice versa. How many of those transitions were as seamless as the existence of a “standard” would have you believe?

The problem with forks is never with the core. It’s with the changes. And while each change is presumably beneficial, adopting them limits the portability of the applications.

If you map forked distros in a family tree (as in the diagram linked by Patrick Chanezon in the main post), once you pick a starting point on the tree, you can only move further down the branch, because you’re likely relying on features that only live on that branch. And every time you either select a platform, or decide to leverage a new capability, you’re restricting your choices even further.

Now, there are many reasons for a buyer to consciously make a decision to limit the scope of their future options. It’s a similar calculus to deciding to use closed-source software. If the value is greater than the cost, it’s a smart move.

Clearly there’s the possibility of good to come from forking, but there seems to be a distinct lack of acknowledgement that the choice to fork reduces real-life compatibility (as opposed to marketing compatibility) and that open-source is not a free lunch.

Disclaimer: Until January, I was Lead Architect for Customer Solutions at Apprenda, so my views do come tinted with Apprenda-colored glasses. But they’ve also been shaped by my experiences talking to customers searching for a PaaS and numerous application migrations between JEE servers.

[…] ball gazing into what this might actually mean for the project. In a well reasoned and articulate post, Stephen O’Grady of Redmonk fame (who, it needs to be admitted is contracted by many of the […]

