Alex Reads - MS Research - Cohesive and Isolated Development with Branches

Saturday, January 14, 2012

Post-read thoughts -
This paper seems like it was written by amateurs. Note that I am not a member of the academic community, nor do I write academic papers, so this is more of a comment on their writing style and their ability to defeat my BS filter (e.g. Can you prove that? How exactly do you define 'x'?).
Having said that, there are some useful ideas and interesting results from their interviews and research with real projects. Here's what I found interesting:

Studies show that branch usage increases greatly among new adopters of DVC.

Pre-DVC: 1.54 branches/month. With DVC: 3.67 branches/month (though I worry about the methods used to obtain these numbers).
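As a quick sanity check on those two figures, the relative change can be worked out directly. This is only arithmetic on the numbers quoted in these notes; the variable names are mine, and it says nothing about how the paper measured the rates.

```python
# Branch creation rates quoted in the notes (branches per month).
pre_dvc = 1.54
with_dvc = 3.67

# How many times more often branches are created after the switch.
ratio = with_dvc / pre_dvc

# The same change expressed as a percentage increase over the pre-DVC rate.
percent_increase = (with_dvc - pre_dvc) / pre_dvc * 100

print(f"{ratio:.2f}x as many branches per month")   # prints "2.38x as many branches per month"
print(f"{percent_increase:.0f}% increase")          # prints "138% increase"
```

So the quoted rates amount to roughly 2.4 times as many branches per month after adopting DVC.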

The idea that prior to DVC, branches were created only for releases, not new features.

To effectively use DVC branches, create one for each new feature, localized bug fix, or maintenance effort.
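A minimal sketch of the branch-per-task habit, assuming Git is installed and on the PATH. The helper names (`git`, `create_task_branches`, `demo`) and the branch names are hypothetical, chosen only for illustration; the paper does not prescribe a naming scheme.

```python
import subprocess
import tempfile


def git(repo, *args):
    """Run one git command inside `repo` and return its stdout."""
    result = subprocess.run(
        ["git", *args], cwd=repo, check=True, capture_output=True, text=True
    )
    return result.stdout.strip()


def create_task_branches(repo, task_branches):
    """Create one branch per task, each starting from the current branch."""
    base = git(repo, "rev-parse", "--abbrev-ref", "HEAD")
    for name in task_branches:
        # `checkout -b <name> <base>` creates the branch at `base` and switches to it.
        git(repo, "checkout", "-q", "-b", name, base)
    # Return to the base branch so it stays free of unfinished work.
    git(repo, "checkout", "-q", base)
    listing = git(repo, "for-each-ref", "--format=%(refname:short)", "refs/heads")
    return listing.splitlines()


def demo():
    """Build a throwaway repo and give each hypothetical task its own branch."""
    with tempfile.TemporaryDirectory() as repo:
        git(repo, "init", "-q")
        # Identity is passed inline so the demo does not depend on global git config.
        git(repo, "-c", "user.name=Example", "-c", "user.email=example@example.com",
            "commit", "-q", "--allow-empty", "-m", "Initial commit")
        return create_task_branches(
            repo,
            ["feature/search-box", "bugfix/null-title", "maintenance/update-deps"],
        )
```

Calling `demo()` returns the list of local branches: the three task branches plus whatever default branch name the local Git installation uses. Each task branch starts from the same base, so work on one does not touch the others until it is merged.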

Studies show that even with DVC, a central repo is still used. (It is important to admit this, IMO)

An accessible DVC repo enables anyone to contribute to the project. Under centralized VC, developers without commit privileges were reduced to working w/o VC, and accepting changes from unofficial project members had high barriers.

Academics advise us to checkpoint code at frequent intervals in a place separate from the 'team repo'. Only tested and stable code should be integrated into the 'team repo'. DVC systems enable and encourage this practice.

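The term "Semantic conflict" - All VC systems are good at detecting syntactic (textual) conflicts, but not semantic conflicts, where two changes merge cleanly as text but the combined code no longer behaves correctly.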
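To make the distinction concrete, here is a minimal sketch of a merge that succeeds textually but is wrong semantically. The `three_way_merge` function is a deliberately naive stand-in I wrote for illustration (it assumes edits only change lines in place), not how any real VC system merges, and the timeout example is invented.

```python
def three_way_merge(base, ours, theirs):
    """Naive line-by-line three-way merge.

    Assumes all three versions have the same number of lines (edits only
    change lines in place), which is far simpler than a real VC merge.
    Returns (merged_lines, conflicting_line_numbers).
    """
    merged, conflicts = [], []
    for i, (b, o, t) in enumerate(zip(base, ours, theirs)):
        if o == t:          # both sides agree (or neither changed the line)
            merged.append(o)
        elif o == b:        # only "theirs" changed this line
            merged.append(t)
        elif t == b:        # only "ours" changed this line
            merged.append(o)
        else:               # both changed the same line differently
            merged.append(o)
            conflicts.append(i)
    return merged, conflicts


base = [
    "TIMEOUT = 30  # seconds",
    "def connect_wait_ms(): return TIMEOUT * 1000",
    "def retry_wait_ms(): return 5000",
]
# Branch A switches the constant to milliseconds and updates its one caller.
branch_a = [
    "TIMEOUT = 30000  # milliseconds",
    "def connect_wait_ms(): return TIMEOUT",
    "def retry_wait_ms(): return 5000",
]
# Branch B, unaware of A, reuses the constant while it still means seconds.
branch_b = [
    "TIMEOUT = 30  # seconds",
    "def connect_wait_ms(): return TIMEOUT * 1000",
    "def retry_wait_ms(): return TIMEOUT * 1000",
]

merged, conflicts = three_way_merge(base, branch_a, branch_b)

# Run the merged source to see how it actually behaves.
namespace = {}
exec("\n".join(merged), namespace)

print("textual conflicts:", conflicts)                      # prints "textual conflicts: []"
print("connect wait (ms):", namespace["connect_wait_ms"]())  # prints "connect wait (ms): 30000"
print("retry wait (ms):", namespace["retry_wait_ms"]())      # prints "retry wait (ms): 30000000"
```

No line was changed by both branches, so the merge reports no conflicts, yet the retry wait is now a thousand times longer than branch B intended. Only a test or a reviewer who understands both changes would catch it.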

Awareness of 'Distract commits', which are commits that are required to resolve merge conflicts.

Abstract. The adoption of distributed version control (DVC), such as Git and Mercurial, in open-source software (OSS) projects has been explosive. Why is this and how are projects using DVC? This new generation of version control supports two important new features: distributed repositories, and history-preserving branching and merging where branching is easier, faster, and more accurately recorded. We observe that the vast majority of projects using DVC continue to use a centralized model of code sharing, while using branching much more extensively than when using CVC. In this study, we examine how branches are used by over sixty projects adopting DVC in an effort to understand and evaluate how branches are used and what benefits they provide. Through interviews with lead developers in OSS projects and a quantitative analysis of mined data from development histories, we find that projects that have made the transition are using observable branches more heavily to enable natural collaborative processes: history-preserving branching allows developers to collaborate on tasks in highly cohesive branches, while enjoying reduced interference from developers working on other tasks, even if those tasks are strongly coupled to theirs.

Introduction

Purpose of Version Control

Create an isolated workspace from a particular state of the source code.

Can work within one branch without impacting other developers

Purpose of branches

Should be 'cohesive' so that a team can work together on a branch

Keeps new features separate, and allows merging features when complete