Unless there’s a specific reason why you can’t, I recommend that you commit code to your repository in the smallest possible atomic chunks.

Look, it’s great that modern source control systems allow you to commit an atomic change to more than one file at a time. This is an essential feature and I can’t imagine living without it. But just because we can doesn’t mean that we should.

Probably the biggest reason to keep changes small is to make it easier to track down which change caused a particular bug. If two changes are commingled into a single commit, you may have to manually disentangle them to figure out which one was responsible.
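To make that concrete, here’s a minimal sketch of the kind of binary search you might run over history to find the offending commit. It’s an illustration under assumptions, not any particular tool’s implementation: the name `first_bad_commit`, the `is_bad` callback, and the toy history are all hypothetical, and I’m assuming a linear history in which the bug, once introduced, stays present through the newest commit.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Binary-search a linear history for the first commit that shows the bug.

    Assumptions (hypothetical setup): `commits` is ordered oldest to newest,
    the newest commit is bad, and every commit after the first bad one is
    also bad.
    """
    lo, hi = 0, len(commits) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid          # the culprit is at mid or earlier
        else:
            lo = mid + 1      # the culprit is after mid
    return commits[lo]


# Toy history: each entry is one small commit doing exactly one thing.
history = ["fix typo", "rename helper", "change rounding", "add logging"]

# Stand-in for "build this revision and run the failing test".
def shows_bug(commit: str) -> bool:
    return history.index(commit) >= history.index("change rounding")

culprit = first_bad_commit(history, shows_bug)
print(culprit)  # prints "change rounding"
```

The search always lands on a single commit. If that commit does one thing, you’re done; if it had bundled the rounding change with the rename and the logging, you would still have to pull those apart by hand.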

Large changes also make it harder to go back and dig through history. If you’re trying to understand why someone made a particular change to a particular file and are looking through the old revisions, you might be confused if you find that someone made several other changes to the same file at the same time. Are the changes interrelated? Hopefully the change’s description will explain, but old change descriptions are often less than fully illuminating in practice.

If a change is demonstrated to cause a bug, you might want to back it out. If other changes have been lumped together with it, you might unintentionally back out other, unrelated changes that did not cause the bug and might be desirable to keep in the tree.

Consider also the impact on other engineers who have changes in development. These engineers will need to merge their changes with yours. The larger and more invasive a change is, the harder it can be to merge with other changes.

One specific thing you should not do is combine cosmetic and functional changes in a single commit. For example, while making a change, if you notice that a source file has tabs instead of spaces, and your coding policy calls for spaces, don’t reformat the entire file at the same time that you are making your other changes. The same goes for moving curly braces, making the text fit within a certain number of columns, switching between // and /* */ comments, etc. It’s fine to make these changes to clean up code to meet coding policies… just don’t mix them with substantive, functional changes to the code.
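If you want a quick sanity check that a cleanup commit really is cosmetic, something like the following rough sketch can help. It’s a heuristic I’m making up for illustration, with hypothetical names (`normalize`, `is_cosmetic_only`): it only catches whitespace differences such as tabs vs. spaces, re-indentation, and blank lines. It won’t recognize moved braces, re-wrapped lines, or changed comment styles as cosmetic, and it will wrongly ignore whitespace changes inside string literals or in indentation-sensitive languages.

```python
from typing import List


def normalize(source: str) -> List[str]:
    """Collapse runs of whitespace on each line and drop blank lines."""
    return [" ".join(line.split()) for line in source.splitlines() if line.strip()]


def is_cosmetic_only(old: str, new: str) -> bool:
    """Return True if the two versions differ only in whitespace (heuristic)."""
    return normalize(old) == normalize(new)


before = "int f() {\n\treturn 1;\n}\n"
retabbed = "int f() {\n    return 1;\n}\n"          # tabs replaced with spaces
retabbed_and_changed = "int f() {\n    return 2;\n}\n"  # also changes behavior

print(is_cosmetic_only(before, retabbed))              # prints True
print(is_cosmetic_only(before, retabbed_and_changed))  # prints False
```

If a commit that claims to be “just reformatting” fails a check like this, that’s a hint that a functional change snuck in and should be split out.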

One common way people end up committing large changes is the dreaded “mass integrate”. That is, you have two branches, and you want to catch one branch up with all the changes made to the other. In a mass integrate, rather than integrating each individual change over by itself, you integrate all of the changes together in one big commit. Mass integrates may touch hundreds or thousands of files.

Because mass integrates lump many changes together, they may introduce and fix large numbers of bugs all in a single commit, and it may be difficult to track down what caused what. They obscure file history, especially if the descriptions of the individual changes being integrated are not all copied and pasted into the mass integrate’s description. If the mass integrate proves to be unwise, you may not realistically be able to back it out without creating an even bigger mess.

Mass integrates into a long-lived branch, e.g., your trunk or a release branch, are a “worst practice” in software development. Mass integrates into a development branch are not such a problem; the problem arises when merging a development branch back into the main branch. Sometimes you may have no choice but to integrate a bunch of changes together (each change individually breaks things, and you need all of the changes or none for the tree to stay in a consistent, working state), but it can be massively disruptive for a large pile of changes to be thrown into a branch all at once.

[…] Let’s start with a simple example: a single project with just 2 engineers, where each engineer commits a single change once per day. Now suppose that both engineers, for some reason, decide to start committing their code in batches of 5 changes once per week instead. I’m not sure why they would do this; I see large benefits to keeping commits small. […]
