What are the security implications of the kernel.org compromise for the trustworthiness of the code hosted in the site’s Git repository? Today's announcement explained the mitigations provided by 160-bit hashes, site mirroring, and the distributed nature of the development effort. Are those reassurances solid, given that the hashes for the main Git repository are stored alongside the code they hash, given typical usage and mirroring practices, and given that the site was compromised for 17 days before detection? Finally, what improvements, if any, could be suggested based on this event?

2 Answers

Well, as far as Git is concerned, it turns out that the software and its distributed nature do a lot to protect the code and to detect malicious modification. For starters, an old commit cannot be altered unless the altered content still produces the same SHA-1 hash. If the hash does change, the change propagates to every descendant commit, because each commit's hash covers the hashes of its parents, so that's out. Nothing that predates the breach could reasonably be altered without breaking the repository and syncing.
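
To make the chaining concrete, here is a minimal sketch in Python of how Git derives an object ID and why a change to one commit ripples into its descendants. The header format (`<type> <size>\0` followed by the content) is how Git hashes objects; the `make_commit` and `chain_ids` helpers, the fixed author line, and the placeholder tree IDs are assumptions made up for illustration, and real commit objects carry more detail.

```python
import hashlib
from typing import List, Optional


def git_object_id(kind: str, body: bytes) -> str:
    """Return the SHA-1 ID Git assigns to an object: sha1("<kind> <size>\\0" + body)."""
    header = f"{kind} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def make_commit(tree: str, parent: Optional[str], message: str) -> str:
    """Build a simplified commit body and return its object ID (illustrative helper)."""
    lines = [f"tree {tree}"]
    if parent is not None:
        # The parent's ID is part of the hashed content, which is what chains commits.
        lines.append(f"parent {parent}")
    # Hypothetical, fixed identity and timestamp so the example is deterministic.
    lines.append("author A U Thor <author@example.com> 1314000000 +0000")
    lines.append("committer A U Thor <author@example.com> 1314000000 +0000")
    body = "\n".join(lines) + "\n\n" + message + "\n"
    return git_object_id("commit", body.encode())


def chain_ids(trees: List[str]) -> List[str]:
    """Create a linear history, one commit per tree ID, and return the commit IDs."""
    ids: List[str] = []
    parent: Optional[str] = None
    for index, tree in enumerate(trees):
        parent = make_commit(tree, parent, f"commit {index}")
        ids.append(parent)
    return ids


if __name__ == "__main__":
    original = chain_ids(["a" * 40, "b" * 40, "c" * 40])
    # Tamper with the content (tree) of the middle commit only.
    tampered = chain_ids(["a" * 40, "d" * 40, "c" * 40])
    for before, after in zip(original, tampered):
        print(before, "unchanged" if before == after else "CHANGED -> " + after)
```

Only the middle commit's content differs between the two histories, yet the last commit's ID changes as well, because its hashed body includes the altered parent ID.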

Commits could have been added during that 17-day period. Because timestamps are part of the data covered by a commit's SHA-1 hash, we can narrow down the list of "suspect" commits by looking at anything newer than the last commits before the breach, or anything where a commit's date is earlier than its parent's date. If you've ever tried to alter a commit in Git after it reached a shared repository, you know how glaringly obvious it is that something historical was altered.
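
As a rough sketch of that triage, the Python below flags commits dated at or after the start of the breach, plus commits dated earlier than one of their parents. The `Commit` record and the `suspect_commits` function are hypothetical names for this example, and the input is assumed to have been extracted from the repository beforehand. Since commit dates are set by whoever creates the commit, treat this as a way to shortlist commits for review, not as proof either way.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Commit:
    commit_id: str
    parents: Tuple[str, ...]
    committer_time: int  # seconds since the Unix epoch


def suspect_commits(commits: List[Commit], breach_start: int) -> List[str]:
    """Return IDs of commits that deserve a closer look, in input order."""
    by_id = {c.commit_id: c for c in commits}
    flagged: List[str] = []
    for c in commits:
        # Rule 1: dated at or after the start of the breach window.
        added_during_breach = c.committer_time >= breach_start
        # Rule 2: dated earlier than a parent, which normal history rarely shows.
        older_than_parent = any(
            p in by_id and c.committer_time < by_id[p].committer_time
            for p in c.parents
        )
        if added_during_breach or older_than_parent:
            flagged.append(c.commit_id)
    return flagged


if __name__ == "__main__":
    history = [
        Commit("a", (), 100),
        Commit("b", ("a",), 200),
        Commit("c", ("b",), 150),  # earlier than its parent "b"
        Commit("d", ("c",), 400),  # after the assumed breach start
    ]
    print(suspect_commits(history, breach_start=300))
```

Clock skew between developers' machines can trip the second rule on legitimate commits, so expect some false positives.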

Because full copies of the kernel repository, each containing the full history, are so widely distributed and are always updated incrementally by following a chain of commits that can't be rolled back without the user being aware, the kernel source code is actually remarkably safe even when the hosting website is compromised.
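
The "can't be rolled back without the user being aware" property comes down to an ancestry check: an incremental update is only a clean fast-forward if the tip you already have is an ancestor of the new tip. A minimal sketch in Python, assuming the commit graph is available as a plain mapping from commit ID to parent IDs (the `is_fast_forward` name and the tiny graph are made up for illustration):

```python
from typing import Dict, List


def is_fast_forward(parents: Dict[str, List[str]], old_tip: str, new_tip: str) -> bool:
    """Return True if old_tip is reachable from new_tip by following parent links."""
    stack = [new_tip]
    seen = set()
    while stack:
        current = stack.pop()
        if current == old_tip:
            return True
        if current in seen:
            continue
        seen.add(current)
        stack.extend(parents.get(current, []))
    return False


if __name__ == "__main__":
    # a <- b <- c is the history a mirror already has; x is a rewritten branch off a.
    graph = {"a": [], "b": ["a"], "c": ["b"], "x": ["a"]}
    print(is_fast_forward(graph, "b", "c"))  # new tip builds on the known tip
    print(is_fast_forward(graph, "b", "x"))  # known tip is missing from the new history
```

If a server started offering `x` to someone who already held `b`, the update would not be a fast-forward, and that is the kind of discrepancy every existing clone is in a position to notice.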

Finally, because only a limited number of individuals are able to legitimately commit to the kernel, we can expect that they would notice any updates to their sections that they did not expect. The human control of sign-offs (kind of a technical control, but kind of not) plays a big part in knowing where the content came from.

All in all, compromising the kernel source in an undetected manner while it is in Git would be one of the more impressive hacks of the decade. One would have an easier time using the proper channels to slip in code that looks innocuous but is malicious.