How One Line of Code Forced a Full OS Release

It’s not often that Apple ‘fesses up about a bug, but the release iOS 7.0.6 came about because somehow an extra line of code got added to a function and broke it. Normally a bug would probably not warrant a complete release, but unfortunately this one-liner broke SSL connection verification. This is the code that checks that the connection is talking to the correct destination and uses SSL and TLS to verify it.

You can take a look at the source code of the function on Apple’s open source website. This is in C, by the way, not objective-C. It’s in the function SSLVerifySignedServerKeyExchange, which starts at line 575. Select the whole webpage and copy it into a text editor that shows lines. Skip down to lines 607 and 608, which look like this:

Now imagine if line 608 was accidentally duplicated. That was the bug. Just an extra line: goto fail;. This time, of course, there was no error condition to test so it always jumped to the fail label. At this point the variable err (of type OSStatus) would have a zero value, so although the label is meant to be jumped to only when a test fails, there was no failure to kick it off.

In C, though, zero is often taken as false and non-zero as true. The value zero is quite often used to mean success while the non-zero value holds the actual error number. That’s the case here. So the extra goto meant the code skipped a number of other tests and jumped to the fail: label at line 648.

This frees up memory allocated for two blocks of type SSLBuffer and returns err, which has 0 in it and skips all the checks from lines 610-646. The concern that connection verification returns success after only doing a few checks is what led Apple to issue a new version of iOS 7 (and iOS 6 for iPod Touch).

How Could This Happen?

There’s a little bit of paranoia that because the NSA knew of vulnerability soon after an earlier iOS 6 release, it must have been instrumental in creating it. How do we know that wasn’t the case? I suspect that Apple tests every new release with some clever software looking for vulnerabilities and that’s how they pick them up so quickly. Of course, we’ll probably never know the truth.

More likely is that (a) it was a simple cut-and-paste mistake or (b) merging a branch failed. That seems likeliest. It’s believed that Apple uses Git and/or Subversion version control systems as both are supported in Xcode. I know that errors of this type can occur when doing merges in Subversion because another developer broke some code of mine last week doing a merge.

Merges are used when, for example, you make a change in an earlier version of code — say to fix a bug — and you want later versions to also have the change. Typically you develop in the main branch (known as trunk) of your code, then label all files in the release. Anybody can get back to that release’s version of the files — even after many years and hundreds or thousands of changes — because the files were labeled. But a fix that has to be in that version won’t be automatically merged into later releases or the main branch, so you’ll have to manually merge it. That’s when things can go wrong.

Conclusion

A slightly improved code guide style (or adhering to it) might have caught this, as might static analysis which would show that the code after line 608 could never be reached. Guide styles for C-like languages sometimes prescribe that after an if, while or do statement there should be a block, i.e., starting with { and never a single statement.

But nobody would add an entire extra block, or even merge an entire extra block, because it’s OBVIOUSLY wrong. The problem with the actual problem is that it’s SUBTLY wrong. The indentation, which is just eye-candy and not syntactical in C, leads you to believe that it’s part of the if condition. This is why so many organizations just outright ban “lone statements” following if in C and C++, because it’s known to be a haven for stupid errors like this one.

I appreciate David’s presentation of the problem, but I think a pithier conclusion is in order.
I’m not going to diff the relevant versions of sslKeyExchange.c, but I see 2 cases in this scenario concerning the time of merge:

Case 1. Relatively minor changes in this area of code: I’d get very suspicious, very fast of Subversion’s merge facility.

Case 2. Relatively major changes in this area of code: Post-merge testing was insufficient for this scope of change.

I also agree with Unca’s comment about blocking. Embedded blocks that are unassociated (i.e. not associated with loops, conditions, function definitions, etc.) as in Unca’s example are a totally legitimate construct. The main use I’ve found for them is for multiline macro definitions, such as this:

#define COMPLEX_MACRO( x, y) \
{ \
x += 1; \
y += 1; \
}

int f1(void)
{
…
if ( a == b )
COMPLEX_MACRO( x, y);
else
…
}

int f2(void)
{
if ( a == b )
…

else
COMPLEX_MACRO( x, y);
}

compilation would fail for f1.
f2 would compile and misbehave at runtime.

An argument might be, “Well, just use curly brackets for all if/else blocks.”

That’s fine, but
(1) this style must be enforced initial development and at each code update — either manually or with some style-enforcement tool that’s part of development process, and

(2) In an large, established code base, the typical desire is to change only as much code as is necessary.

As I can’t edit my post, i’ll post an afterthought after some additional reflection.

Subversion cannot be necessarily blameless in case 2.
I’m used to a merge manager that recognizes the difference between easy and challenging merges, and brings challenging merges to the attention of the person running the merge.

So, my revision is:

Case 2a: Subversion did not recognize a risky merge correctly, so the person running the merge never got involved. This is a strike against Subversion.

Case 2b. Subversion did recognize a risky merge, and the person running the merge was notified to review the merge, and that person missed the duplicated line. This falls under human error.

Neither of these cases would excuse the developers/integrators from their responsibility to identify testing appropriate to the degree of code change.