I'm never sure when a project is far enough along to first commit to source control. I tend to put off committing until the project is 'framework-complete,' and I primarily commit features from then on. (I haven't done any personal projects large enough to have a core framework too big for this.) I have a feeling this isn't best practice, though I'm not sure what could go wrong.

Let's say, for example, I have a project which consists of a single code file. It will take about 10 lines of boilerplate code, and 100 lines to get the project working with extremely basic functionality (1 or 2 features). Should I first check in:

Go by units

What is a unit? It depends on what you're doing; if you're creating a Visual Studio project, for example, commit the solution right after its creation even if it doesn't have anything in it.

From there on, keep committing as often as possible, but still commit only completed "units" (e.g. classes, configurations, etc); doing so will make your life easier if something goes wrong (you can revert a small set of changes) and will reduce the likelihood of conflicts.

Leave traces on multi-person projects

My rule of thumb is to check in once my solution file (or other build script piece) is done, even if it contains a few files that are empty. It's good practice for when more than one person is working on the project. That file tends to have the worst merge issues initially as people are adding things to the project, so it needs to be committed early and often.

Even if you're the only one working on the project and it's only got one file, I find it easier to follow the same workflow and save the thinking for the problem at hand.

README development

The first commit can be a README file with as little as one line summary of the project, or enough information about the project's first milestone. The broad topics can also include:

Introduction

Project description

Project structure

Coding conventions

Instructions on how to:

build

test

deploy

Known issues and workarounds

todo list

Terms of use

The practice of updating the README before making changes to a project is also referred to as Readme Driven Development and it allows you to think through the changes before investing time in making those changes.

Anybody who wants to contribute to or use this software will then start with the README.
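
As a rough sketch of the outline above, the following Python snippet generates a skeleton README from the listed topics. The function name `readme_skeleton`, the Markdown-style headings, and the "TODO" placeholders are illustrative assumptions, not part of the original answer.

```python
# Section names follow the topic list above; the heading style is an assumption.
SECTIONS = [
    "Introduction",
    "Project description",
    "Project structure",
    "Coding conventions",
    "How to build",
    "How to test",
    "How to deploy",
    "Known issues and workarounds",
    "To-do list",
    "Terms of use",
]


def readme_skeleton(project_name: str, summary: str) -> str:
    """Return README text: a title, a one-line summary, and empty sections."""
    lines = [f"# {project_name}", "", summary, ""]
    for section in SECTIONS:
        # Each section starts as a placeholder to be filled in before coding.
        lines += [f"## {section}", "", "TODO", ""]
    return "\n".join(lines)


if __name__ == "__main__":
    print(readme_skeleton("example-project", "One-line summary of the project."))
```

Committing the output of something like this as the first commit gives every later change a place to be described before it is made.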

See more Q&A like this at Programmers, a site for conceptual programming questions at Stack Exchange. And of course, feel free to ask your own.

33 Reader Comments

Agree with the README guy. Documentation gets missed too often so making it the first thing checked in is a good start. Of course if that's the only documentation ever checked in it doesn't do much good…

If there's *anything* that keeps you from checking in at *any time*, you're doing it wrong. Continuous integration, or checking into trunk, or merge issues are common excuses. Having code that doesn't compile or isn't fit for consumption by other members of your team is another. These are easily solved by checking into your own branch. But you should be able to check in as often as you like. Every time you save your files if you want. When you leave for the day, when you successfully compile - whenever. Anything that prevents you from checking in is an artificial constraint that could come back to bite you later. It doesn't matter if it's "done" or not, the question is if you lost your machine to a hardware failure, how much would it take you to get back to where you were?

I've had developers on my team that admit to not checking in for a week, and that's when I hit the roof. If that work had been lost, the cost is incredible. Plus, you then have the ability to risk going down a path that might not work out, and rollback at any point. Make your own branch. Check into it compulsively. Then, you can save all your excuses for when you're deciding whether or not to merge into trunk when the answer is less black and white.

An empty project. I always start a project (personal, job) with a name and source code control committed to a separate computer. That way the source code repo is my backup and a valuable info source for questions like when did I start doing things that way. Even uncompleted projects have been harvested for something that was useful. Also, working with source control means I commit at least once a day (if I can continuously work on the project).

Agreed. Asking when you should make your first commit on a project is like asking when you should first save a document you're writing; it's when you've done enough work such that losing it would be a very bad thing.

Of course, if you're using git, you can create your own branches willy-nilly and make commits until the cows come home.

The nice thing about git is you don't even have to make tons of commits; just keep staging stuff to the index every time you've made some progress. For instance, when function "foo" works, stage it. When function "bar" works, stage it. And then, when all that work finally adds up to a whole set of changes, make a commit out of it.

I tend to structure commits by units of work that can be described in just one line. To me, the commit history should read like a textual description of what has been done. The first line sums up the changes; then comes an explanation of how it was done and why.

The first commit is usually the empty project with its name, at least when I suspect that this is going to be more than just five lines of code. Then every step, when the foot touches ground again, is a commit. Implemented function to rotate object by x degrees around axis y? That's the commit and its title. Changed object x to also keep track of how it is mangled by mangling_executer? See above.

If there is a thing too error-prone to do in one step, but still just one thing that doesn't really make sense to split in two, there's staging. If it's some tedious housekeeping stuff, like refactoring, that should be done in small steps but is just one easily described task, branch from main, commit every skirt-step, then merge with main again. Of course this is a loose definition. And it also implies the idea that main and its commit log should reflect the development of the project in an abstract manner, with each commit making a (small) difference.

At the end of the workday there is always a commit and if it is mid work, it is labeled as such and gets a small braindump with it, so when someone, usually me, reads it he knows what's been done and what's still to do.

The key thing for me really is to structure commits in chunks that take one sentence to be accurately described.
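
A minimal sketch of that rule as a check, in Python. The 72-character limit and the function name `check_commit_message` are assumptions made for illustration; the comment above only asks for a one-line summary followed by an explanation.

```python
from typing import List

MAX_SUBJECT_LENGTH = 72  # assumed limit; the comment only says "one line"


def check_commit_message(message: str) -> List[str]:
    """Return a list of problems found in a commit message (empty if none)."""
    problems = []
    lines = message.strip("\n").split("\n")
    subject = lines[0].strip()
    if not subject:
        problems.append("subject line is empty")
    elif len(subject) > MAX_SUBJECT_LENGTH:
        problems.append("subject line is too long to read as a one-line summary")
    # If there is a body, expect a blank line between summary and explanation.
    if len(lines) > 1 and lines[1].strip():
        problems.append("missing blank line between subject and body")
    return problems


if __name__ == "__main__":
    message = "Rotate object by x degrees around axis y\n\nExplains how and why."
    print(check_commit_message(message))
```

If the summary will not fit in one line, that is usually a hint that the commit covers more than one unit of work.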

The nice thing about git is that you can do things like squash commits. But it's also a distributed system, so I think the original poster from Stack Overflow would need to specify a bit more about their workflow, whether or not their repo is a clone of a public shared one, or if it's just something they're keeping local.

If you spend some time understanding things like rebasing, branching, merging, and squashing in the context of something like git, then the question of "how often should I commit" really becomes something more like "how should I manipulate my history to provide a clean, useful, and accurate history of the work that has been done."

Perforce has a best practices white paper. Read it. It's not too long or shamelessly self-promoting.

Basically you should always be working in your own branch and frequently pull from mainline, and check in to your own branch whenever you want. At least daily. Then merge to mainline when the feature is complete.

I do my initial commit right after I had to spend some time figuring out where I fucked up at. I figure at that point the code is sufficiently complex to warrant tracking. If I'm being smart, I'll commit as soon as the software does something right, even if it's just running a bunch of stubs.

Other than that I start committing when it becomes more than just me working on the code. At that point it becomes cumbersome not to. (Beyond easy merging, it's nice to be able to play the blame game and win.)

You should avoid committing things that don't build or pass tests so you can bisect easily.
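
To illustrate why this matters, here is a small Python sketch of the binary search that bisecting performs over a history. It is a simplified model, not Git's implementation: the function name `first_bad_commit`, the commit labels, and the `is_good` callback are all hypothetical. The search only works if every commit it lands on can actually be built and tested.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_good: Callable[[str], bool]) -> str:
    """Binary-search an oldest-to-newest history for the first bad commit.

    Assumes commits[0] is good, commits[-1] is bad, and every commit in
    between can be built and tested. A commit that does not build cannot be
    classified, which is what makes unbuildable commits a problem here.
    """
    low, high = 0, len(commits) - 1  # low is known good, high is known bad
    while high - low > 1:
        mid = (low + high) // 2
        if is_good(commits[mid]):
            low = mid
        else:
            high = mid
    return commits[high]


if __name__ == "__main__":
    history = ["c1", "c2", "c3", "c4", "c5", "c6", "c7", "c8"]
    broken_since = 5  # hypothetical: the bug was introduced in "c6"
    print(first_bad_commit(history, lambda c: history.index(c) < broken_since))
```

With eight commits the search needs only three checks, but each check depends on the chosen commit being in a testable state.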

If you have trouble describing the commit briefly and succinctly, it's probably too big and should be split.

Don't be afraid to tell someone to fix a patch they sent you even if it's trivial or a matter of preference.

There is no such thing as "when". Your code should *always* be under source control.

When I create a new project in Xcode, I always tick the box "create local git repository". When I'm working on a new website, the first thing I do is `svn mkdir` on the subversion server we use, then I make a checkout of that and start adding the actual website.

You should "make the first commit" before you have even written a single line of code.

If it's just an experiment, and you're not sure if it will actually ever be shared with anyone else, then at least make a local git repository. But always have at least something.

Basically you should always be working in your own branch and frequently pull from mainline, and check in to your own branch whenever you want. At least daily. Then merge to mainline when the feature is complete.

Do you mean pulling mainline to the branch you are working on? That's considered harmful in many setups.

If you just mean pulling mainline to another local branch, I don't really see the advantage. Although whether it helps might depend on the system and how much interaction it allows with remote branches.

When to commit? That depends on the version control system used and if you are working with others.

If I'm working alone, I check in whenever a new function "should" work or when I've made significant progress. Hopefully, multiple times a day. Version control is a way to get back to that point again AND a way to get our precious source code to another device as a backup. Check in as often as makes sense, but at least daily.

When working in a team and on the same branch, then you need to be more cautious. Checking in any code that will break something else is bad. Often, I'll comment out broken parts or put huge if (0){} blocks around it and still check in the code as a way to let others see it AND get it to a different device.

With distributed VCS like Git, I tend to work in my own branch where I can act like I'm working completely alone, but still push updates to the main repository daily.

Back when I was new to coding, I'd take daily "ZIPs" of my development tree as a backup/checkpoint. Then I learned about VCS (SCCS and RCS) so I started using them. Over the years, I've used VSS, CVS, SVN, BZR, Git, and a few other tools. I'm still completely addicted to version control. It has saved my a$$ more times than I'd like to admit. I can't see any downside to using it. Having all that change history is a fantastic thing and having it stored in at least different places completely ROCKS!

If you are embarrassed by your code - welcome to the real world and try to do better every day, week, month, year. I've been coding professionally over 20 yrs and I wrote some crap code last week. It works, but it isn't elegant. If I had more time, there must be a better way to accomplish the same thing. I was out of time and needed something that worked. Bam, crap code. Checked in for everyone on the project to see. Oh well.

I'm not suggesting that testing is not important. I'm a huge believer in TDD and use it too, but test cases are just like code to me. Check-in early and often.

When working in a team and on the same branch, then you need to be more cautious. Checking in any code that will break something else is bad. Often, I'll comment out broken parts or put huge if (0){} blocks around it and still check in the code as a way to let others see it AND get it to a different device.

That is, of course, why you want autotest on commit. The commit is not accepted if it does not pass all the tests. It becomes very difficult to break other people's code and if it does, it's because they don't have enough testing.

Too easy. If you're using Git, commit as soon as you are at a mental stopping point, or whenever you feel like stopping, or whenever you just feel like it.

If you're using SVN, only commit when you are absolutely positive that your code is complete and will not cause errors for others on your development team.

I like to start off with a directory structure but none of that matters, what matters is that if you have a central repo you don't wreck it with incomplete commits. Meanwhile if you have a distributed repo you shouldn't waste time worrying about it since it's yours.

I've had developers on my team that admit to not checking in for a week, and that's when I hit the roof. If that work had been lost, the cost is incredible.

You "hit the roof"? That sounds like you scream and shout and give someone a dressing down in front of all their colleagues. I hope it actually means that you take them to one side and calmly tell them that they should be checking in much more frequently.

Maybe he means you should rebase often with mainline to make sure your patch set will continue to be usable. Things can change and the sooner you learn about and adapt to the change the better. Most of the time, it's better to stop and resolve the divergence earlier rather than later. The end goal is not just have a lot of branches but to eventually have those branches merged and eliminated as well.

Most likely, at least that's what I do.

Work on a local feature branch, git pull to local master and rebase local feature branch as needed. And naturally rebase local feature branch before merging to local master and pushing.

In most ways Git is the most pleasant to use VCS I've ever used. The only thing I find it lacking is GUI (for Windows people who don't like command line) and that some of the commands can be a bit difficult to use correctly if you don't understand what they do.

E.g., I usually merge --squash my changes from my feature branch when they are done. But if you do that and then try to rebase your feature branch from master it will cause you a lot of grief. (Because that means Git doesn't understand that your local branch changes are the same as the squash merge. So everything conflicts.) When you understand that "you shouldn't do that" it's not hard to work around. But it would be helpful if the Git tools helped beginners out a bit more with stuff like that. (You already get command hints for some things, e.g., if you try to push when you have to do a pull and rebase first.)

Excuse my ignorance, but with Git, does one make a branch on the remote server to which they can commit? I know Git supports local also, but one of my main reasons for committing often is that I assume my computer may suddenly die, taking all of my data with it.

If everything you're doing is local, then 80% of what I consider for a reason to constantly commit is moot.

I agree 100% with this. Even when I am doing a major refactor, it gets checked in many times as I get core pieces done, even if the code does not yet fully compile. And many times I have finished something or gotten 90% done and discovered a much better way to do it, and I roll back and start over. Source control is your friend, so use it constantly, and never go home without committing what you have at that time. You can always unroll it if need be.

Personally, I work from home and the office, so it is critical I check in constantly or I won't have my latest changes when I get to the other side.

And any decent source control system (we use Perforce) supports branching and merging, and even if you work alone having a dev branch and a trunk is critical to be able to stage stuff into the working tree. All our developers have their own branch and check in constantly and we then merge into the trunk to combine stuff. Then we merge from the trunk into the release tree to push to the staging server and then to the live tree from there. Many times stuff is in trunk that won't make it into the release tree and live on the server for weeks, but we are able to stage it nicely.

Also when I am doing something that is going to take a while, I merge into my own separate dev tree to keep it out of my primary dev tree. Invariably something comes up while I am halfway through a project that might take me a week or two, and if the work is changing a lot of stuff it can be hard to merge just the necessary bits back into trunk for fixes or small feature requests. So a separate dev tree can help with that.

I always go with the check in early and often motto. If you end up in a situation where more than one person is working on the project, then be sure that you are not committing broken code, so that they can continue working. Otherwise, you should commit every time you feel good about the state of your code and you hit a point where you do not want to lose the work you've completed thus far.

Also, if testing is part of your continuous integration plan when checking in code, then be sure the code you check in passes all current tests as well so as not to disrupt the other developers.

I never understood the reason for keeping it out of source control. People at my company pushed for doing it this way because they wanted the project to settle down before it was put into source control. I can understand that maybe we don't know how we want to organize the code yet, but when we figure that out we can rearrange the code in source control. At one point someone had erased the code on the network drive where the master version was being stored. A fancy version of passing around a floppy disk. The code had to be rebuilt from people's local working versions. This illustrates the reason you put it in source control, even when it's rough and non-functional. How much time do you want to spend if you lost it tomorrow? How much code do you want to lose and have to rewrite if the code was lost? Unless those answers are "a few days" and "10K+ lines" even for the rough beginning of your project, you should put it in source control.

When you're working alone, you can commit when you want. But, the fastest way to piss off coworkers is to not commit your work that they need to get their job done. Especially if you're "that guy" that doesn't commit his work before he goes home at night, for the weekend, out on vacation ... and you roll in late to say "oh, you're already working on that? dude, I didn't commit my stuff yet, so you're going to have to do it over." In the military, "that guy" would tend to get soap-socked in his bunk at night.

My general practice is to commit as soon as I have enough code that reproducing it from scratch when I fuck it up would take more than a few minutes.

So in practical terms, as soon as I've managed to get some code to compile or run successfully and do something.

This, too.

Commit something that works, even if it's not "perfect".

You don't want to get stuck jacking around for an hour just to figure out that what you had from the start was as good as it's going to get ... yet you screwed with the code so much you can't remember what that first solution looked like. It's nice to have that safety net.

Your first commit should be as minimal as possible, so that all meat is committed in the same way, as non-initial commits. Your first commit should just have a minimalist README file basically naming the project, or otherwise to say, this project is for X. So your first commit is essentially the zero point, and then each normal unit of work you commit is committed the same way, rather than treating the first one special.

Never heard of README driven development, but it sounds like a really good idea.

mattmc3 is 100% correct. There is no such thing as too early or too often when it comes to source control commits. Use tags for important milestones though, and branch if you're refactoring or some other major change from a stable code base.