I'm a software developer and software architect, and I've seen a huge number of stupid bugs in code. But this time I see a bug in human behavior. Students of the Center for IT-Security, Privacy and Accountability (CISPA) at Saarland University discovered that thousands of databases from different companies are directly accessible from the internet (have a look at the original paper here: http://cispa.saarland/mongodb/). They did this research for MongoDB, but it seems there are even more databases out there with critical data (like social security numbers and credit card information) that are currently accessible from the internet.

A default installation of MongoDB allows no external access and requires no password. "Partly secure" is worse than "not secure", because many users take "partly secure" to mean "secure enough". In this case a single configuration value makes the difference between "secure" and "open to everyone". And as soon as you want to split the application server and the database server, it is tempting to simply enable remote access.
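For illustration, the two settings in question can be kept secure together – a sketch of a locked-down configuration in MongoDB's YAML config format (option names from the official configuration file reference):

```yaml
# mongod.conf – bind only to localhost and require authentication
net:
  bindIp: 127.0.0.1        # do NOT change to 0.0.0.0 without enabling auth
  port: 27017
security:
  authorization: enabled   # clients must authenticate before accessing data
```

The point is that changing `bindIp` alone is exactly the "one single configuration value" that turns a closed database into an open one – unless `authorization` is enabled as well.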

CISPA thinks that "… responsible for these leaks is not MongoDB Inc. but the MongoDB users, who have insecurely configured the open source software …" – but in my opinion this is not only the users' fault. One design principle of software (whether it's meant to be accessible from the internet or not) must be "secure defaults". You should NEVER design your software in a way that makes it easy to set up insecurely. Especially with databases: Microsoft had a hard time realizing that an "sa" account without a password is a bad idea. They came to this conclusion many years ago … but some other database vendors still seem to think that developers and administrators are not clever enough to set a password while setting up a database service. They don't want to add the complexity of typing in and remembering (or documenting) a password, because a big selling point of MongoDB (and other database engines) is how simple it is to set up.

Yes, it's complete nonsense to open port 27017 for traffic from the internet into the DMZ, or even worse into the corporate network. And yes, as an administrator you should ALWAYS become familiar with the products you allow into your network (being a "good" administrator is not about staring at SCOM eight hours a day – it's about constantly learning about the tools your users and developers need and use). But in my opinion it's also up to the vendor of a tool to make it easy to use securely and hard to use insecurely. And in my opinion it's complete nonsense to allow remote access to a "database" without any authentication – prohibiting such nonsense must be done by the vendor in code, not only in documentation.

Recently I saw a talk about "Railway Oriented Programming" by Scott Wlaschin. While talking about a functional programming style for passing results and failures from one function to the next, he used F# to show a "sum type" (Wikipedia [http://en.wikipedia.org/wiki/Tagged_union] describes it as "a data structure used to hold a value that could take on several different, but fixed types. Only one of the types can be in use at any one time, and a tag field explicitly indicates which one is in use").
I thought that would be a very nice thing to have in C#, too – especially with LINQ statements that pass IEnumerable around.
I thought about a concrete but simple use case for such processing and came up with a simple array of strings: { "1", "12", "123", "1234", null, "12345" }. I want to calculate "100 / (value.Length - 2)" for these strings in a way that lets me filter out the results that threw exceptions (in this case "12" will throw a DivideByZeroException, because its length minus two is zero) and process all successfully calculated values.
A handful of extension methods and classes lead to the syntax presented in this unit test:

[TestMethod]
public void PerformingActionsWhileCalculating()
{
    var logger = new ExceptionLogger();
    var counter = new ExecutionCounter();
    var res1 = new[] { "1", "12", "123", "1234", null, "12345" }
        .WithAction<string, int>(counter.Action2) // not optimal: we need to specify the types here
        .WithAction<string, int>(counter.Action3) // not optimal: we need to specify the types here
        .Skip(1)
        .Try(Calculate) // here the value is calculated while calling the two actions defined above
        .WhenIs(logger.HandleException)
        .ToArray();
    Assert.AreEqual(5, res1.Count()); // we get 5 return values (one element was skipped)
    Assert.AreEqual(5, counter.Count); // only 5 executions of the calculation
    Assert.AreEqual(2, logger.List.Count()); // our list of logs contains two exceptions
    Assert.AreEqual(100, (int)res1[1]); // check for a successful result
}

As you can see, the "Select" has been replaced with a "Try", which calls "Calculate" for each element of the array and returns an "IEnumerable&lt;Either&lt;Exception, TRight&gt;&gt;". "Either" is a struct for values that can have one of two types (in this case either "TRight" or "Exception").

Then I call "WhenIs" with the method "HandleException" of "ExceptionLogger". This method accepts an argument of type "Exception", so the compiler can use type inference to determine the type to check for. Since the "Either" is either of type "TRight" (in this example the calculation returns an "int") or an "Exception", it can also hold a "DivideByZeroException" – and that's what we get for the first element passed to "Calculate" (just as a reminder: we skip the first element of the array, so the first element passed to "Calculate" is "12" – the calculation divides by the length of the string minus two, which is 0 for the string "12").
Since the extension method "WithAction" does not simply execute the function provided, but wraps the value together with the function into a new object, "Skip(1)" prevents one execution of the function, and the function can do things like measuring the duration of the calculation.
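The C# implementation lives in the GitHub repository linked below; to illustrate the idea independently of the extension-method plumbing, here is a minimal sketch of the same pipeline in Python (all names – `Either`, `either_try`, `when_is` – are my own for this sketch, not part of the library):

```python
class Either:
    """Holds either a successful value or the exception that occurred."""
    def __init__(self, value=None, error=None):
        self.value, self.error = value, error

def either_try(items, func):
    """Like 'Try': apply func, wrapping the result or the exception into an Either."""
    for item in items:
        try:
            yield Either(value=func(item))
        except Exception as ex:
            yield Either(error=ex)

def when_is(items, exc_type, handler):
    """Like 'WhenIs': call the handler for matching exceptions, pass everything on."""
    for e in items:
        if isinstance(e.error, exc_type):
            handler(e.error)
        yield e

def calculate(value):
    return 100 // (len(value) - 2)   # raises for "12" (division by zero) and for None

logged = []
data = ["1", "12", "123", "1234", None, "12345"]
results = list(when_is(either_try(data[1:], calculate),   # data[1:] mirrors Skip(1)
                       Exception, logged.append))

print(len(results))      # 5 results (first element skipped)
print(len(logged))       # 2 exceptions ("12" and None)
print(results[1].value)  # 100, i.e. 100 // (3 - 2)
```

In Python the "tag" of the sum type is simply which of the two fields is set; C# can express this more safely with the generic `Either` struct and type inference.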
I hope you are a bit curious about the implementation, which can be found on GitHub (https://github.com/Interface007/FuncLib). I’m not sure if this implementation is really useful, but I had a lot of fun playing with expressions, generic types, functions of functions, implicit operators etc. It’s like running in nature – you don’t get anything done … but it’s fun.

If you find an interesting problem for this solution, just drop me a comment or a mail – it would be very nice to know that the code helped somebody solve a real problem.

There still seem to be many people who assume that "using SSL" or "keeping the default password secret" are effective countermeasures against being compromised. While this is partly true for SSL (correctly implemented, it will protect your data's confidentiality and integrity), default passwords are "evil by default". But you have to take care of every potential attack – and you should take care of the OWASP Top 10. You can see examples of this in two recent incidents:

Twitter's TweetDeck did not implement proper output encoding: by simply rendering the data without removing the JavaScript, it contained an XSS vulnerability – resulting in about 30,000 people tweeting pink hearts … who knows how many times this vulnerability had been exploited before?
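Output encoding is not hard to get right when you let the platform do the escaping. A sketch in Python (the `render_tweet` function is my own illustration; `html.escape` is the standard-library helper):

```python
import html

def render_tweet(text):
    """Encode user-controlled text before embedding it into HTML."""
    return '<div class="tweet">' + html.escape(text) + '</div>'

payload = "<script>alert('xss')</script>"
print(render_tweet(payload))
# the script tag arrives in the browser as text, not as executable markup:
# <div class="tweet">&lt;script&gt;alert(&#x27;xss&#x27;)&lt;/script&gt;</div>
```

The same principle applies to any templating engine: encode at the point of output, for the context (HTML, attribute, JavaScript) the data is written into.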

As a developer you clearly need a basic understanding of SSL, authentication methods and encryption, but to produce secure software you also need to understand the risks. Just applying some random countermeasures is not what makes your application secure; you need to understand the things that may make your application insecure – things like default passwords (which are a bad idea by default), XSS, CSRF, injection, etc.

Heartbleed hit the internet community on 2014-04-07. Many bloggers have already explained the bug and some of its consequences (most of that information can be read on a web site created exclusively for this bug – how many bugs have their own homepage?). Some also discussed why such a bug can happen (missing checks of input parameters, the internal implementation of memory management, …). But what most of them do not describe is what we can learn from it.

It's easy to point at the line of code that caused this issue and say "that's clearly wrong – you should have checked the data", but that's not the point. C does not check your memory management the way C# does, so you will always make mistakes, and it's pointless to complain about a specific single error (one that went undetected for years). What we should do instead is try to find ways to prevent such problems in our code. So have a look at the bug fix and think about what allowed the bug to happen.

You can look up the commit for the fix on GitHub – it's a fairly simple fix, but it reveals a really ubiquitous issue: trust in caller-provided information (I don't blame C for its lousy support of secure memory management – it's an old language without a focus on preventing the developer from making mistakes … it simply assumes that you as a human never make a mistake). In this case the length of the block the caller wants to get back was provided by the caller. In my opinion this is a major design flaw. The caller should not be allowed to determine the length of a response – it can provide a hint about the maximum size of content you "should" return, but you should NEVER simply use that number.

So is the bug relevant to C# developers? Yes, definitely: it shows that trusting external data is a really bad idea – and that has nothing to do with C, SSL or open source … it's a really basic security concern: do not trust the data in your methods' parameters. The fix in OpenSSL now compares the requested length with the length of the buffer and uses the minimum of both. I would recommend not even doing that: simply return the same data you received – that way there would be no need for a caller-specified length that might be wrong.
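The principle translates into any language. A sketch in Python of the general pattern (pure illustration with hypothetical names – no relation to the actual OpenSSL code, which works on raw memory):

```python
def heartbeat_vulnerable(payload, claimed_len, adjacent_memory):
    """Trusts the caller-supplied length and reads past the real payload,
    leaking whatever happens to sit next to it (the Heartbleed pattern)."""
    return (payload + adjacent_memory)[:claimed_len]

def heartbeat_fixed(payload, claimed_len):
    """Never returns more than the payload actually contains."""
    return payload[:min(claimed_len, len(payload))]

secret = "user:admin;password:hunter2"
print(heartbeat_vulnerable("ping", 20, secret))  # leaks part of the secret
print(heartbeat_fixed("ping", 20))               # returns only "ping"
```

And following the recommendation above, the even better fix is to drop the `claimed_len` parameter entirely and echo back exactly the payload that was received.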

I tried to resist writing this – not because I think it's not worth mentioning, but because I don't really like Apple … and because I'm biased, I don't want to talk badly about them. But I can't ignore that one of the most successful software companies failed in a really bad way.

Even without deep knowledge of C, you should be able to see the duplicated "goto fail;" line – that's why it's now called the "goto fail" bug.

I don't blame the developer for accidentally duplicating a line of code (shit happens), but I really have an issue with the fact that this bug was not detected by a unit test. The whole purpose of the method is to check an SSL certificate, and the call to "final" is essential for testing the certificate. Unfortunately (because of the duplication) this line cannot be reached under ANY circumstances.

There are 4 “things” that should have detected the bug before it went to production:

Compiler warnings: in C#, when you compile some code, the compiler will complain about "unreachable code". You have to actively insert compiler directives to silence that warning. But maybe the compiler Apple uses does not emit such a warning – I don't know.

Unit tests a): someone must have inserted the line with the call to SSLHashSHA1.final() – and this person must have done so intentionally, because it's an additional and really important check to make sure the certificate is valid. So: why the hell is there no unit test with an invalid certificate that tests for exactly this condition?

Unit tests b): with unit tests you usually measure something called "code coverage", so you can see which code is tested and which is not. When there is code that is not covered by any unit test, someone should have a look at whether that is OK – that person would clearly see the issue and would be able to fix it.

Manual testing: obviously there was no manual test against a web page that serves a certificate with a manipulated private key.

So my issue with this bug is NOT that someone made a mistake by duplicating one single line of code – as I already said: shit happens. But it's really, really bad not to find that kind of issue in a security-related area when there are four different checks/tests that should have been done and each of them would have identified the problem. I really don't want to think about all the other issues that might still be hiding inside that code if they don't fix compiler warnings, don't write unit tests with high code coverage and don't test the scenarios the code is supposed to prevent.

There are also some “NSA conspiracy theories” out there, but: “Don’t assume malice when stupidity is an adequate explanation”.

What should you learn from this?

Never ignore compiler warnings – review your code and fix them!

Write unit tests for your code, and don't stop at 100% coverage – evaluate whether your tests include the use cases (and misuse cases) you wrote the code for.
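To make the "misuse case" point concrete: a security check needs at least one test where the check is supposed to fail. A sketch in Python (the `verify_signature` function is a hypothetical stand-in, not Apple's code):

```python
def verify_signature(signature_is_valid):
    """Hypothetical stand-in for a signature check: 0 means success."""
    return 0 if signature_is_valid else -1

# use case: a valid signature verifies
assert verify_signature(True) == 0
# misuse case: an INVALID signature must be rejected –
# this is exactly the kind of test that would have caught the goto fail bug
assert verify_signature(False) != 0
print("all checks passed")
```

Coverage tells you the rejection branch was executed; only a misuse-case test tells you it actually rejects.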

I just forwarded this interesting video to a colleague; in it, Daphne Koller (co-founder of the MOOC platform Coursera) discusses the new opportunities lying in the data generated by MOOCs. For users, a platform like this contains tons of information and knowledge – and for the people providing the content and the platform, the users offer a huge amount of data, too. It's not just the name, address, career level and interests of the users, but also (by serving slightly different material) the opportunity to see whether something like showing the teacher inside the video the whole time helps users get better results or distracts them while learning.

I’m currently on the following platforms:

Coursera – a platform that partners with universities worldwide, and offers courses online for free