Open source software vs. the NSA

From a security point of view, the trouble with cloud-based applications, and with closed source software in general, is that you can never tell whether there are flaws that will leak your information, or even back doors put there deliberately to let third parties get at it.

Open source software gives you many advantages.

You can understand exactly what the software will do when run. Strictly speaking you can understand what any software does, but source code written in a high level language serves two purposes: telling the computer what to do and telling humans what the program is intended to do. This is because classes, functions and variables in the program are given English names. Programmers may also write comments in the source code to annotate it. The names and comments may be misleading, but this becomes apparent when you look at what the code does as a whole. If you cannot personally understand the program, you can be reasonably sure others do. One thing that gives me confidence is that previous flaws have been found and fixed.

You can be sure you are running the same software you have gone to the trouble of understanding because you can compile it yourself. You can compile the user applications, libraries, operating system kernel, drivers and even the compiler yourself if you want. More usually you will entrust most of this work to others, such as Linux distributions. Programs downloaded from such sources are cryptographically signed. Because the source code is available, anyone can check that it produces the same program that is provided pre-compiled.

So there is little likelihood of a back door in open source software. Linus’s Law states that “given enough eyeballs, all bugs are shallow”. This means that bugs in open source software, especially the most important and most widely used open source software, get found and fixed quickly. In The Cathedral and the Bazaar, Eric Raymond described how the Linux style of development leads to superior code quality. All this means there is less likelihood of accidental leakage of your secret information.

Should they decide they do not like us encrypting our files or obscuring our online activity, it would be very hard for the authorities to take open source software away. The nearest they have got is the Consumer Broadband and Digital Television Promotion Act, which was intended to protect music companies who wanted to put DRM into music by making trusted computing compulsory. The idea was that computers would be required to have a special chip that would only let them run programs cryptographically signed by some authority. You would not be able to run your own programs.

The bill got nowhere and such laws are unlikely to because open source software is so ubiquitous. It runs the Internet. Samizdata runs on a computer running the Linux kernel using GNU libraries and uses an open source web server, database and blogging software written in languages compiled by open source compilers and interpreted by open source interpreters. So do everyone else’s web sites. Most of the electronic gadgets in the world that have any software at all have open source software in them, including phones and TVs. None of this is going away.

As much as Google and Microsoft have brands to protect, if the government makes laws big companies have to follow them. Governments have no such hold over open source programmers who are geographically, organisationally and ideologically dispersed.

It is possible that certain algorithms have mathematical back doors and that the NSA has hired all the people clever enough to find them. It is possible that the NSA tried this with a cryptographic random number generator (the suspect Dual_EC_DRBG) and was caught out. We can be somewhat confident that the NSA cannot break AES encryption. There are other encryption algorithms available.

Nothing is certain, but open source software gives us some control over our computers and some defence against governments in a way that closed corporate software never can.

The back doors discussed in that article were “voluntarily” created at the server level. Open source does absolutely nothing about that. It doesn’t matter how much you can understand source code, it matters what’s actually installed and running on the server.

So you’ve got a FOSS client running on a FOSS OS, connected to a server running a FOSS OS, a FOSS DB, a FOSS web server and a FOSS app. And the government tells the server admin he has to keep various connection logs and hand them over upon request. Same thing.

Ken Thompson (co-creator of Unix, which like pretty much all Unix/Linux/etc. systems is written in C) hid a back door in the C compiler as an experiment, and made it practically undiscoverable; he described the trick in his 1984 Turing Award lecture “Reflections on Trusting Trust”. He modified the login program to always allow him to log in as a superuser, regardless of any other system settings. Then he modified the C compiler to insert that code whenever it was compiling the login program, even when the code was absent from the source file. He also modified the compiler to insert both of these changes whenever it compiled itself. Then he compiled both, and removed the modifications from the source code. Thereafter, any time that compiler was used to compile either the login program or the C compiler, it re-inserted his code, making it self-perpetuating with no visible source.

Very sneaky, and the only real way around it would be to bootstrap your own C compiler from another language (or use a completely independent C compiler). It just shows that bad things can be hidden even in open source software, although it’s much harder than hiding things in proprietary software.

This is because classes, functions and variables in the program are given English names.

Which would make them incomprehensible to non-English speakers.

On the other hand, much software is written by non-English speakers. A friend of mine had to rewrite an accounting system in which all names and comments were in Urdu.

There are also some theorists who argue that “descriptive” names are bad – because the name may not be accurately descriptive. Comments may be wrong or out of date. Names and comments could also be deliberately deceptive.

Open source is a good thing – it exposes the actual workings of the software. But it’s not as simple as reading the names and comments.

And to be blunt, rigorously verifying the functions of a large software package is far beyond the capacity of any single person, much less an average user.

Dale: mywickr looks interesting. Just bear in mind that unless you can download the source code to the application and the server, compile them, and run the compiled app on your own iPhone and the compiled server software on your own servers, then you have to trust the mywickr company to some extent or another. I expect that, at best, they assure you that your data is encrypted on their servers such that even they cannot read it. But you have to trust them about that too.

WDO: I’m talking about the ability to run your own software on your own computers. When you use GMail or whatever you are indeed trusting a closed system over which you have no control. Whether or not Google use open source software is irrelevant and I never claimed otherwise. If you want to do e.g. web mail in an open source way, you need to use an open source web mail program and compile and run it on your own servers. A quick search found Roundcube. Of course you have a new set of security problems if you do that…

Rich: I’ve worked with French people and they write their software in English. Pretty much all open source software is written in English. I think people just accept that they need passable English to write software.

Though I didn’t say it, I was actually comparing source code to disassembled machine code when I was talking about the usefulness of the English names of things. I was trying to explain why source code is important. The point of names is to reveal the intention behind the logic. It takes more than just that for code to be understandable. But *successful* open source software tends to be fairly understandable because otherwise no-one will want to work on it, so it will not be successful.

As to your last comment, I disagree, at least in principle. It is simply a matter of time. In general each sub-component needs to be understandable by one person otherwise they could not have written it, even if one person cannot comprehend the whole thing at once. So either you verify it piecemeal, or you have different people look at the different parts. This is in fact what happens. For example you will find if you try to submit a patch with a back-door in a part of the Linux kernel that the person who understands that part will notice very quickly and shout about it.

Ah, and I remember the back door someone almost got into Linux. It worked because some clever fellow got access and changed a ‘==’ to ‘=’, which assigned 0 (root) to a variable while still reading like a comparison. The reason it was found was that Linus was using BitKeeper at the time, and that software kept a hash of the code; a difference in the hash was noted. But for that bit of good luck, who knows what might have happened. One lesson learned was to never write ‘a == 0’ but rather ‘0 == a’ in such logic, so the same typo becomes a compile error; another was the value of hashing the code.

Rob Fisher (Surrey) @ June 7, 2013 at 10:52 pm: Rich: I’ve worked with French people and they write their software in English. Pretty much all open source software is written in English. I think people just accept that they need passable English to write software.

That’s true now. But it will become steadily less true as IT penetrates further into non-Anglophone countries, especially China.

As to your last comment, I disagree, at least in principle.

In practice, one is dependent on the FOSS community to verify the software. That seems to work nearly all the time. I wonder how well it would work against organized subversion.

There’s nothing to prevent the NSA (or the FSB or the PLA) from placing “moles” in FOSS projects, covered as students, free-lancers, or volunteers. The “moles” would work on the projects, making useful contributions and establishing reputation. Over time they could gain control of a few key modules, and insert very carefully obfuscated backdoor code.

As long as the package does everything it’s supposed to do correctly, who’s going to notice it does something else too?

Almost all the effort in open source goes into coding. Very little into making its use comprehensible to non-coders. None at all into making it available as a consumer product. It is for geeks by geeks. However much IT has grown in the last generation, that’s still a minority among even educated people. The “we” you are talking about can hide a bit, and live secret lives, but have little political power, and excludes a very large number of people who might want to be free, but cannot understand how.

I’m reminded of Fahrenheit 451. Some books survive, nominally, by moving them into individual human memory, but not a literate culture. Or of priest-holes. Or the secret fire-chambers in Zoroastrian temples in the country areas of Iran.

That we is not me, in particular. If you are going to crawl away into the open-source catacombs, then maybe you could leave me a map I can read.

Nonsense. 99.999…% of the people on planet earth simply do not have the skills to decipher even a small program, while at the other end of the spectrum, lack of source code is no barrier to understanding the internal logic of a piece of software.

Those 99.9… % are in no different position with open source than with closed, they have to take someone’s word for it, to decide who they trust. They may sensibly decide that the interests of, for example, Microsoft, align more closely with theirs than, for example, Richard Stallman’s.

Chuck in the fact that Google, Facebook, Twitter et al are built on open source software, which won’t magically protect them from FISA requests, and I’m afraid the whole ‘open source will save us from oppression’ argument is a dud. Ridding ourselves of the oppressors will save us from oppression, not our choice of software.

It’s possible to store your data in the cloud if it is encrypted locally first. If the service provider cannot read your data then they can’t give it to the government.

This is possible, though it does of course introduce some limitations. An example of it being done pretty much perfectly is the password manager Clipperz. You can download the code and run it yourself. All decryption is done on your local machine.

Your data stored (i.e. available) on the web should be as secure as it can be. Whether or not that includes photos of friends, dogs, etc. is your choice; do you think it may be used by some unscrupulous so-and-so (being the gov. or even an identity thief, for example) against you at a later date? I still think Facebook et al is a timebomb, but I AM paranoid. 🙂

If you mainly use a portable device it follows that there will be more risk of it being stolen, so I would say it makes sense to have personal or sensitive data encrypted (the government officials that keep leaving laptops on buses should be paying attention at this point!).

Home machines are less at risk of robbery, but from a malware perspective it’s probably only a matter of time before some cleverclogs devises a method of stealing from us all.

If you do take the encryption route, you also have to take into consideration the safety, duplication and backup of your keys, otherwise you can’t get at your own data; that’s another stumbling block for the masses!

Another thing you may want to consider is keeping yourself under the radar. It’s one thing to stick it to the man by encrypting all your emails, but are you going to draw unnecessary attention to yourself in doing so?
GPG (that Rob mentioned above), as well as keeping your emails private, also has the purpose of digitally signing your messages, so that no-one can send a message pretending to be you if they haven’t got your private key. If this was chosen as the de facto standard for emails, you get proof of sender AND you won’t stand out from the crowd, as everyone’s mail would be encrypted. But Big Brother wouldn’t like that.

You may also be interested to know that essentially all practical encryption can be broken by brute force; it’s just a matter of resources and time (the one-time pad is the theoretical exception). The aim of security experts is to make breaking the encryption take far longer than the data’s useful lifespan.

Stallman raises an interesting point about ownership of data you put out on another server (i.e. Facebook), and he also sees most “cloud computing” as a hook to lock customers into proprietary access software.

I imagine most cloud computer storage and access systems have unilateral surrender of some or all rights buried down deep in the TOS, possibly incomprehensibly worded.

That’s a good point. You would hope, however, that they are still using known encryption algorithms – why isn’t that spelled with a ‘y’? – and not proprietary ones. Bruce Schneier says public ones are well tested and the “security by obscurity” concept often doesn’t work in relation to in-house algorithm designs.
There’s also the fact that a known algorithm should easily be available to get your data back if you need to (as long as the key is known). If a proprietary algorithm is used and the company goes under there’s no guarantee you’ll be able to access the data at a later date.

Actually, I have not stated my question clearly enough: my point was that I got the impression that some commenters here (including you) think that cloud storage has at least some security advantages over the “physical” option, and that you may have implied that encryption would make it even safer. This would seem at least counter-intuitive to me (although obviously not the encryption part), which is the main reason I am for the most part keeping away from clouds.

It’s like most things: you have to choose which set of problems you have. Who do you trust most to keep your data files secret? You might think it is yourself, but do you really know what you are doing? You might put your files on a computer that is not kept up to date with the latest security fixes. You might connect to an open wireless access point and someone might use a known exploit to access files on your computer. Or your hard disk might fail and you find yourself without a working backup.

Or you hire some cloud service provider to take care of these details for you. They may know more about keeping data secure and backed up than you. But *now* we know that the NSA forces them to hand over your data without telling you.

So what to do? *Either* you start taking care of things yourself, and using open source software on computers you own is one way to go. Or we think of ways to make it so that cloud service providers are unable to hand data over to the NSA because they simply don’t have the keys. Hence the discussion about local encryption.

There is a famous quote regarding digital data (sadly I can’t remember by whom), which states that if your data does not exist in THREE different places it essentially doesn’t exist at all. In other words, if you don’t have it backed up carefully, before you know it *poof* it’s gone. This includes storing it at more than one physical location (in case your house burns down, for example).

I am sceptical about handing my information on to anyone else, but I can see how it would be easier to pop say, the next few pages of a book you’re writing onto some online storage, instead of having to take it to a relative’s house! It could become tiresome on a daily basis.
