Latest GitHub incidents show Git History Rewrite is both Day and Night for Enterprise Compliance

It did not take a full day before GitHub had to disable its recently announced source code search feature again. The reason is explained well in this blog post on Ars Technica. In a nutshell, many users had pushed Git commits containing confidential information, including private keys (cryptographic credentials), to public GitHub repositories. Anybody who knew how to search for those private keys could use GitHub’s code search to find them and log into major production sites. This is definitely not GitHub’s fault but a critical oversight by the repository owners who uploaded confidential credentials to a public repository. You would not blame your local newspaper if somebody published their credit card number and SSN in it, either. GitHub reacted very responsibly by quickly disabling the code search feature once the news broke.

GitHub’s incident made me think about how we would handle a situation like this. CollabNet TeamForge supports code search as well, for CVS, Subversion and Git. However, our code search only displays results the user is authorized to see. Access rights (both read and write) can be defined at the branch level (Git) or path level (Subversion) within the same TeamForge project role. So, if somebody accidentally uploaded their credentials into a TeamForge repository, only a very limited number of users would be able to see them.

However, what should you do once you discover that confidential information was accidentally stored in a Git repository? Just removing the problematic files in the next commit does not help, as anybody can still go back one commit in the history and see the problematic content again. We would actually recommend the same steps GitHub recommended: first change your credentials, then rewrite Git history with Git’s filter-branch command. In a nutshell, every existing commit on your branch is recreated with the files matching a certain pattern removed. Obviously, this changes the checksums (SHA-1) of the affected commits and of every commit built on top of them, as you are changing what happened in the past. As a consequence, if you try to push your new commits to the Git server, it will reject them because you are trying to change already written history. However, there is a force option for git push that lets you do exactly that: rewrite history, pretending that certain changes never happened.
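To make the checksum argument concrete, here is a minimal Python sketch of how Git derives object IDs. It is not an implementation of filter-branch; it only replays three made-up snapshots with and without a hypothetical id_rsa file and compares the resulting commit IDs. The file names, author identity and timestamp are assumptions chosen for illustration.

```python
import hashlib
from typing import Dict, List, Optional


def git_object_id(kind: str, body: bytes) -> str:
    """SHA-1 that Git assigns to an object: hash of '<kind> <size>\\0' + body."""
    header = f"{kind} {len(body)}\0".encode()
    return hashlib.sha1(header + body).hexdigest()


def blob_id(content: bytes) -> str:
    return git_object_id("blob", content)


def tree_id(files: Dict[str, bytes]) -> str:
    """ID of a flat tree (regular files only, no subdirectories)."""
    body = b"".join(
        b"100644 " + name.encode() + b"\0" + bytes.fromhex(blob_id(data))
        for name, data in sorted(files.items())
    )
    return git_object_id("tree", body)


def commit_id(tree: str, parent: Optional[str], message: str) -> str:
    """ID of a commit; note that the parent ID is part of the hashed body."""
    lines = [f"tree {tree}"]
    if parent:
        lines.append(f"parent {parent}")
    # Hypothetical identity and timestamp, fixed so the IDs are reproducible.
    lines.append("author A U Thor <author@example.com> 1358000000 +0000")
    lines.append("committer A U Thor <author@example.com> 1358000000 +0000")
    body = "\n".join(lines) + "\n\n" + message + "\n"
    return git_object_id("commit", body.encode())


def build_history(snapshots: List[Dict[str, bytes]]) -> List[str]:
    """Chain one commit per snapshot and return the commit IDs in order."""
    ids: List[str] = []
    parent: Optional[str] = None
    for n, files in enumerate(snapshots):
        parent = commit_id(tree_id(files), parent, f"commit {n}")
        ids.append(parent)
    return ids


original = [
    {"app.py": b"print('hi')\n"},
    {"app.py": b"print('hi')\n", "id_rsa": b"SECRET KEY\n"},  # accidental upload
    {"app.py": b"print('hello')\n", "id_rsa": b"SECRET KEY\n"},
]
# Roughly what a history rewrite does: replay every snapshot without the file.
rewritten = [{k: v for k, v in s.items() if k != "id_rsa"} for s in original]

old_ids = build_history(original)
new_ids = build_history(rewritten)
for n, (old, new) in enumerate(zip(old_ids, new_ids)):
    status = "unchanged" if old == new else "rewritten"
    print(f"commit {n}: {old[:10]} -> {new[:10]} ({status})")
```

Running it shows the first commit keeping its ID, while the commit that introduced the key and the commit after it both get new IDs. Because each commit hashes its parent’s ID, a change anywhere in the past ripples forward to the branch tip, which is why the server treats the push as a change to existing history.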

While you have just seen a very legitimate use case for this feature (removing confidential or problematic content that was accidentally pushed), it should ring all your alarm bells from an audit and regulatory compliance point of view: Git allows you to simply change what the central repository knows about Git history. This feature can be abused to cover traces of unlawful behavior, introduce backdoors, discredit coworkers, etc. In the worst case, it might land you in jail if you have to follow SOX, Basel or FDA standards.

Now we have a really delicate situation: Git History Rewrite is needed to remove unlawful material or confidential information, in other words to comply with your auditing standards and other legal requirements, and at the very same time the feature can be abused to get you into even worse audit trouble. It is both day and night for enterprise compliance. Fortunately, CollabNet’s Git Integration ships with a feature called “History Protection”. We recently blogged about it already, but let me briefly recap what it does: whenever we detect a history rewrite attempt on our servers, we do not block it (provided the necessary permissions are granted in TeamForge). We know that there are a number of legitimate use cases out there (see above), which is why blocking is not an answer. Instead, we log exactly what happened (who, when, what) in our tamper-proof audit log, notify Git administrators and give users of the repository the ability to restore the previous content in a self-service manner (your administrators do not have to get involved). If needed, Git administrators can still permanently remove selected content (like accidentally pushed credentials or unlawful content) at the push of a button. With TeamForge History Protection, our customers can use Git’s history rewrite features with an audit-compliant safety net. The feature is available to both hosted and on-premise customers.
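The "detect and log, but do not block" idea can be sketched in a few lines of Python. This is a hypothetical illustration, not TeamForge’s actual implementation: the function names, the audit record layout and the toy commit graph are all assumptions. It relies only on the general Git notion that a push is a fast-forward when the old branch tip is an ancestor of the new one; anything else replaces existing history.

```python
from datetime import datetime, timezone
from typing import Dict, List


def is_ancestor(parents: Dict[str, List[str]], ancestor: str, descendant: str) -> bool:
    """True if `ancestor` is reachable from `descendant` via parent links."""
    stack, seen = [descendant], set()
    while stack:
        commit = stack.pop()
        if commit == ancestor:
            return True
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, []))
    return False


def handle_push(parents, ref, old_tip, new_tip, user, audit_log) -> str:
    """Accept the push either way, but record rewrites (hypothetical policy)."""
    if is_ancestor(parents, old_tip, new_tip):
        return "fast-forward"
    # Keeping the old tip in the record is what would allow a later restore.
    audit_log.append({
        "who": user,
        "when": datetime.now(timezone.utc).isoformat(),
        "ref": ref,
        "old_tip": old_tip,
        "new_tip": new_tip,
    })
    return "history-rewrite"


# Toy graph: A <- B <- C is the published branch, A <- B2 <- C2 the rewritten one.
parents = {"A": [], "B": ["A"], "C": ["B"], "B2": ["A"], "C2": ["B2"]}
audit_log: List[dict] = []

print(handle_push(parents, "refs/heads/master", "B", "C", "alice", audit_log))
print(handle_push(parents, "refs/heads/master", "C", "C2", "bob", audit_log))
print(audit_log[0]["who"], audit_log[0]["old_tip"], "->", audit_log[0]["new_tip"])
```

The first push only adds a commit on top of the old tip, so nothing is logged. The second replaces B and C with B2 and C2, so it is classified as a rewrite and the old tip is recorded together with who pushed and when.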

If you would like to find out more about CollabNet TeamForge’s History Protection feature or our Git integration in general, please visit our web site or watch an on-demand webinar. We also provide a virtual appliance that is free for up to ten users. If you have any specific questions or feedback on this blog post, feel free to drop a comment.

Johannes Nicolai is CollabNet’s Development Manager leading all Git- and Gerrit-related development efforts. Furthermore, he is responsible for CollabNet Connect /synch, CollabNet’s platform to integrate TeamForge with third-party ALM platforms. Johannes holds a Master of Science in IT Systems Engineering from Hasso Plattner Institut Potsdam and is a Certified Scrum Master. Before joining CollabNet five years ago, he worked in consulting on user-centric design, developed cryptographic software and architected SAP integrations. He is an Open Source enthusiast and contributes to many projects (check out https://www.ohloh.net/accounts/10619 for details).