Notes on open-sourcing abandoned code

Some people want a law that compels companies to release their source code for “abandoned software”, in the name of cybersecurity, so that customers who bought it can continue to patch bugs long after the seller has stopped supporting the product. This is a bad policy, for a number of reasons.

Code is Speech

First of all, code is speech. That was the argument why Phil Zimmerman could print the source code to PGP in a book, ship it overseas, and then have somebody scan the code back into a computer. Compelled speech is a violation of free speech. That was one of the arguments in the Apple vs. FBI case, where the FBI demanded that Apple write code for them, compelling speech.

Compelling the opening of previously closed source is compelled speech.

There might still be legal arguments that get away with it. After all state already compels some speech, such as warning labels, where is services a narrow, legitimate government interest. So the courts may allow it. Also, like many free-speech issues (e.g. the legality of hate-speech), people may legitimately disagree with the courts about what “is” legal and what “should” be legal.

But here’s the thing. What rights “should” be protected changes depending on what side you are on. Whether something deserves the protection of “free speech” depends upon whether the speaker is “us” or the speaker is “them”. If it’s “them”, then you’ll find all sorts of reasons why their speech is a special case, and what it doesn’t deserve protection.

That’s what’s happening here. The legitimate government purpose of “product safety” looms large, the “code is speech” doesn’t, because they hate closed-source code, and hate Microsoft in particular. The open-source community has been strong on “code is speech” when it applies to them, but weak when it applies to closed-source.

Define abandoned

What, precisely, does ‘abandoned’ mean? Consider Windows 3.1. Microsoft hasn’t sold it for decades. Yet, it’s not precisely abandoned either, because they still sell modern versions of Windows. Being forced to show even 30 year old source code would give competitors a significant advantage in creating Windows-compatible code like WINE.

When code is truly abandoned, such as when the vendor has gone out of business, chances are good they don’t have the original source code anyway. Thus, in order for this policy to have any effect, you’d have to force vendors to give a third-party escrow service a copy of their code whenever they release a new version of their product.

All the source code

And that is surprisingly hard and costly. Most companies do not precisely know what source code their products are based upon. Yes, technically, all the code is in that ZIP file they gave to the escrow service, but it doesn’t build. Essential build steps are missing, so that source code won’t compile. It’s like the dependency hell that many open-source products experience, such as downloading and installing two different versions of Python at different times during the build. Except, it’s a hundred times worse.

Often times building closed-source requires itself an obscure version of a closed-source tool that itself has been abandoned by its original vendor. You often times can’t even define which is the source code. For example, engine control units (ECUs) are Matlab code that compiles down to C, which is then integrated with other C code, all of which is (using a special compiler) is translated to C. Unless you have all these closed source products, some of which are no longer sold, the source-code to the ECU will not help you in patch bugs.

For small startups running fast, such as off Kickstarter, forcing them to escrow code that actually builds would force upon them an undue burden, harming innovation.

Binary patch and reversing

Then there is the issue of why you need the source code in the first place. Here’s the deal with binary exploits like buffer-overflows: if you know enough to exploit it, you know enough to patch it. Just add some binary code onto the end of the function the program that verifies the input, then replace where the vulnerability happens to a jump instruction to the new code.

I know this is possible and fairly trivial because I’ve done it myself. Indeed, one of the reason Microsoft has signed kernel components is specifically because they got tired of me patching the live kernel this way (and, almost sued me for reverse engineering their code in violation of their EULA).

Given the aforementioned difficulties in building software, this would be the easier option for third parties trying to fix bugs. The only reason closed-source companies don’t do this already is because they need to fix their products permanently anyway, which involves checking in the change into their source control systems and rebuilding.

Conclusion

So what we see here is that there is no compelling benefit to forcing vendors to release code for “abandoned” products, while at the same time, there are significant costs involved, not the least of which is a violation of the principle that “code is speech”.

It doesn’t exist as a serious proposal. It only exists as a way to support open-source advocacy and security advocacy. Both would gladly stomp on your rights and drive up costs in order to achieve their higher moral goal.

Bonus: so let’s say you decide that “Window XP” has been abandoned, which is exactly the intent of proponents. You think what would happen is that we (the open-source community) would then be able to continue to support WinXP and patch bugs.

But what we’d see instead is a lot more copies of WinXP floating around, with vulnerabilities, as people decided to use it instead of paying hundreds of dollars for a new Windows 10 license.

Indeed, part of the reason for Micrsoft abandoning WinXP is because it’s riddled with flaws that can’t practically be fixed, whereas the new features of Win10 fundamentally fixes them. Getting rid of SMBv1 is just one of many examples.