
Part of what I do in real life is managing several version control servers, including Gerrit and Gitolite. This in itself has its own fun problems, but recently, in the midst of my routine auditing, I noticed two users generating a large number of requests on a holiday weekend. That got me thinking: who works on a holiday? Even more interesting, the requests arrived at regular intervals, which suggested some automated process. I asked both users what was going on, and they explained they were using Atlassian SourceTree (a rather beautiful Git frontend, I might add). Apparently, it defaults to fetching its repositories every 10 minutes.

Now, in a centralized VCS, you might ping the server for the status of your files and see if you're outdated (and need to update before you commit), but with distributed version control, you're actually pulling down the latest and greatest code all the time (all without changing your working tree). In a busy repository, this can show you when you're working on an outdated file - maybe pretty useful. But at the same time, I think there are a few negative sides that need to be considered.
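To make that concrete, here's a throwaway sketch (hypothetical paths and names) of what a background fetch - which is all SourceTree's timer does - actually touches: the origin/* refs get updated, the working tree is left alone, and you can then see what you're behind on:

```shell
# Two clones of a scratch "central" repo; bob fetches while mid-edit.
set -e
tmp=$(mktemp -d)
g() { git -c user.email=dev@example.com -c user.name=dev "$@"; }
git init -q --bare -b master "$tmp/central.git"
git clone -q "$tmp/central.git" "$tmp/alice"
cd "$tmp/alice"
echo v1 > app.c; git add app.c; g commit -q -m "v1"
git push -q origin master
git clone -q "$tmp/central.git" "$tmp/bob"   # bob starts from v1
echo v2 > app.c; g commit -q -am "v2"        # alice publishes v2...
git push -q origin master
cd "$tmp/bob"
echo "local edits" > notes.txt               # bob's in-progress work
git fetch -q origin                          # the automatic-update fetch
cat notes.txt                                # working tree untouched
git log --oneline HEAD..origin/master        # but v2 is now visible upstream
```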

Code Churn

Yep. If your repository is really so active that you need to know where it is at all times, you're going to spend a lot of your time needlessly rebasing your change to keep it at the tip of development. That's not to say you shouldn't do this sometimes, but particularly when I've been writing a feature in an active repository, I write it on an isolated version that I know works well for what I need, test everything, and then, when my feature is complete, I update and test again. If it's broken now, or I have a merge conflict, I have two points I can easily bisect and fix. By the same token, this is the reason Git makes it so simple to branch and merge: you don't need to be working alongside everyone else - merging can be a much more coordinated effort, but developing a feature can be done in isolation (for the most part). If you join the camp of Git rebasers, they would say this is messy and you should always rebase your change (for a more detailed discussion, Git team workflows: merge or rebase? is a good explanation of the two sides). Personally, if I'm doing a simple change that doesn't require an independent feature branch, I may write my change and, before submitting it for review on Gerrit, rebase once just to make sure I'm in sync with the world and there are no conflicts before other developers spend time reviewing it. There: now I don't need to worry about a bunch of churn from my side, and I didn't ping the server continuously.
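For what it's worth, the "rebase once, right before review" habit sketches out like this (throwaway repo and hypothetical names; the Gerrit refs/for/master push at the end is shown only as a comment):

```shell
set -e
tmp=$(mktemp -d)
g() { git -c user.email=dev@example.com -c user.name=dev "$@"; }
git init -q --bare -b master "$tmp/central.git"
git clone -q "$tmp/central.git" "$tmp/work"
cd "$tmp/work"
echo base > base.txt; git add base.txt; g commit -q -m "base"
git push -q origin master
git checkout -q -b feature                  # develop in isolation...
echo feature > feature.txt
git add feature.txt; g commit -q -m "feature work"
git checkout -q master                      # ...while master moves on
echo other > other.txt
git add other.txt; g commit -q -m "someone else's change"
git push -q origin master
git checkout -q feature
git fetch -q origin                         # sync exactly once, before review
g rebase -q origin/master
git log --oneline origin/master..HEAD       # just the feature commit, on the tip
# git push origin HEAD:refs/for/master      # then off to Gerrit for review
```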

Destructive History

This is something I have to worry about less: we intentionally block all operations, via Gerrit and Gitolite permissions, that would be destructive to history. But there have been a number of documented cases where users have inadvertently reverted months of history through a simple force-push. For a few examples, Git Patterns and Anti-Patterns documents the specific case where Eclipse accidentally wiped out some history. Ironically, Luca Milanesio (the author of the Git Patterns and Anti-Patterns refsheet) accidentally force-pushed 186 Jenkins plugin repositories on GitHub. The force-push is the enemy of automatic updates: your remote-tracking branch would be rewound to wherever the force-push took it, and you'd have no way of knowing the rewind happened (aside from looking at the reflog, but see the note under "Pattern #9" in the patterns/anti-patterns refsheet). In either of these cases, if you happened to fetch manually, at least you'd see that a forced rewind happened, and you'd have a chance to investigate. In a scenario where data has been lost, a developer's repository can sometimes be useful in disaster recovery - after all, a Git repository is a Git repository. You should still have other disaster recovery methods in place, but it's always good to have extra options, too.
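Here's a small self-contained sketch (hypothetical paths) of what that manual fetch buys you: the "(forced update)" marker in the fetch output is the hint you'd get that history was rewound, a hint a silent background fetch never surfaces:

```shell
set -e
tmp=$(mktemp -d)
g() { git -c user.email=dev@example.com -c user.name=dev "$@"; }
git init -q --bare -b master "$tmp/server.git"
git clone -q "$tmp/server.git" "$tmp/alice"
cd "$tmp/alice"
g commit -q --allow-empty -m "one"
g commit -q --allow-empty -m "two"
git push -q origin master
git clone -q "$tmp/server.git" "$tmp/bob"   # bob is up to date with "two"
git reset -q --hard HEAD~1                  # alice rewinds...
git push -q --force origin master           # ...and force-pushes
cd "$tmp/bob"
out=$(git fetch origin 2>&1)                # bob fetches manually
echo "$out"                                 # note the "(forced update)" marker
```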

Conclusions

At the end of the day, is the automatic update feature useful? I don't think so. While there are some limited scenarios where it could make things better, I also see a whole lot of cases where it can do more harm than good, even though the incremental fetches are rather inexpensive from a server perspective. Interestingly, I've pulled myself out of several pinches by looking at a Gerrit replication mirror and restoring content from it. At the same time, deletes are never pushed to the mirror (only new content), and non-fast-forwards are rejected. As a developer, you probably shouldn't have to anticipate these "worst case" scenarios, but the time lost by rebasing your change indefinitely is real, and I would advise against even that unless you need to. Gerrit has options to resolve conflicts when it can (i.e. trivial rebase), or you may not need to rebase at all. Tools like Gerrit and Jenkins are there to make your life easier by worrying about the code compiling and merging so you don't have to - second-guessing them by posting rebases all the time only wastes your time (and CI time).
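If you run a plain Git mirror rather than Gerrit replication, one way to get the same rewind/delete protection is the standard receive.* settings; this is a sketch against a scratch bare repository standing in for the mirror:

```shell
# Harden a mirror so nothing can rewind or delete history on it.
tmp=$(mktemp -d)
git init -q --bare -b master "$tmp/mirror.git"
git -C "$tmp/mirror.git" config receive.denyNonFastForwards true
git -C "$tmp/mirror.git" config receive.denyDeletes true
git -C "$tmp/mirror.git" config receive.denyNonFastForwards   # confirm: true
```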

Well, Skype has a fun little security vulnerability, and despite trying to take every action possible to report it to Skype/Microsoft properly, they just don't seem to listen too well. I have seen this vulnerability exploited in the wild, so I really don't feel too bad about disclosing the details. It's also not like it's anything super-serious like the ability to run malicious code remotely, but it is likely to confuse new users.

The issue is, Skype processes Unicode. Perhaps a bit too well. In a file transfer, Skype interprets the Unicode "Right-to-Left Override" character (U+202E) in a filename. An attacker can exploit this to hide the true extension of a file. In the specific case I observed, the filename was something like Screenshot1234<RTL>gnp.scr, which the client displayed to users as Screenshot1234rcs.png. Looks like an image, to the untrained eye. Of course, when viewing the filename in Mac OS X, the RTL override character is not interpreted and is instead replaced with a substitution character ("?").
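You can reproduce the trick with nothing but printf (the filename here is my approximation of the one I saw; U+202E is the UTF-8 byte sequence 0xE2 0x80 0xAE):

```shell
# Everything after U+202E renders right-to-left in RTL-aware UIs,
# so the tail "gnp.scr" displays as "rcs.png" - but the OS still
# sees (and executes) a .scr file.
rlo=$(printf '\342\200\256')
name="Screenshot1234${rlo}gnp.scr"
case "$name" in
  *.scr) echo "real extension: .scr" ;;
esac
```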

I really don't expect much from this issue. Just make sure users are aware of the true extensions of files they are sent. Similar issues include the IDN Homograph Attack, where attackers use lookalike characters from the Unicode character set, along with the IDN system, to confuse users about the site they're visiting, and the SSL Certificate Null Character Attack, where a null character obscures the actual domain and confuses users about the real site they're visiting. Neither of these is a huge risk by itself, but by confusing a legitimate (but perhaps uninformed/unaware) user, they can be leveraged into much greater attacks.

Followup: Skype did eventually understand that this wasn't a support request, so maybe they're working on it. Also, because of the way the Skype clients function, the "is sending a file" message works by the sending client actually transmitting that text to the other user via its form of the "/me" command. So it's still strange, and I don't like it - who knows what can be done, especially if a remote user can send whatever they want there?

Let me preface this by saying I'm not a code review expert. I just happen to have experienced everything from a very formal code review meeting to a very informal peer-review system. And I like code reviews. No, I love them. They really do help produce better code overall, and also help developers familiarize themselves with the code. Done properly, it will end up saving the company significant amounts of money and developer time by fixing errors before they make it into production.

So, here's what I've seen. Formal reviews, with checklists and formal meetings, are time-consuming but very thorough - something that's helpful when you have a detailed set of requirements. Slightly less formal peer audits can be useful, but there are a few problems I've noticed. First, depending on who reviews the code, different issues tend to be caught. Theoretically, everyone should be working from the same guidelines, but different reviewers pick up different things. That's human nature. Also, one book on developing code review standards made a very good point: don't sweat the small stuff. Depending on the circumstances, "small stuff" may mean different things to different people. For one person, spaces vs. tabs may not be relevant; for another, maybe it's eliminating the requirement for Hungarian notation. In either case, you need to figure out what works well for you.

Now, here are some of the problems I've seen: because reviews are completed by different people, some will review a change and approve it with no comments, while others will pick at the code until there's nothing left. Even more amusing is getting comments that are completely unhelpful or wrong on working code, while other code is checked in that won't even compile. This all boils down to the absence of one particularly vital role: the moderator. The moderator is an independent third party whose job is to check the comments, ensure they are all applicable, and ensure all concerns are properly addressed before the review passes. By placing all the power in a single reviewer, you destroy the system of checks and balances that would otherwise exist. If the reviewer doesn't like your code for some reason - sorry, but not today.

The most effective system I have seen so far was a rather informal one using Atlassian Crucible (yes, I really was pleased with their product), but where reviewers were randomly assigned to the review. Then a few of the reviewers (a minimum of two) would look it over for defects and suggestions. Only defects (such as security issues or gross violations of the standards) needed to be corrected; suggestions for improvement were left to the discretion of the programmer. A moderator would observe the comments, decide which ones actually needed to be fixed, and overall steer the discussion in productive directions. After reaching the minimum number of reviews, the review would be closed with a simple pass/fail status, and a re-review would be required for a failed status. The moderator would then ensure all comments were appropriately addressed in future reviews.

Basically, using this method, reviews were both fast and simple, and didn't waste a large amount of time. At the same time, the comments out of a review were, on the whole, helpful, and provided a forum for a developer to make their point if they disagreed with some of the comments. Take this for what you will, but I have found moderated informal pass-arounds to be the most effective reviews.

So, this is something pretty spiffy, and a lot of people have set it up already, provided nice walkthroughs, etc. Since there are many good walkthroughs already (like here or here), I won't go into too much detail. It's stunningly simple to set up, if you don't mind manually pulling the module's source out of version control and building it yourself.

But, one thing struck me as necessary. I have users that don't use two-factor auth right now, and they may not want to. So, how can I make this optional? Well, PAM makes this pretty painless, although it's not built into the module itself:
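Here's a minimal sketch of the relevant PAM stanza (the pam_google_authenticator module name matches the Google Authenticator PAM project; which service file it goes in, e.g. /etc/pam.d/sshd, depends on your setup):

```
# If the user is NOT in the "secure" group, succeed and skip the next
# 1 rule (success=1); otherwise, ignore this line's result and fall
# through to the authenticator.
auth [success=1 default=ignore] pam_succeed_if.so user notingroup secure
auth required pam_google_authenticator.so
```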

That's it. Basically, you have a group "secure": users in that group are required to use two-factor auth, and everyone else is not. It works by skipping the following rule (the authenticator module) if the user is not in the "secure" group, but ignoring the result of pam_succeed_if if they are. Now, this does mean that users need to be manually added to the group in order to be graced with two-factor authentication, but it's still better than nothing. If you're curious about a related enhancement, this bug report details some additional thoughts on transitioning to two-factor auth.