Introduction

I've set up CI in my last two roles, and along the way I've spent some time trying to simplify the process. I thought I'd share a few 'secrets of success' for getting CI up and running.

What Makes Setting Up CI Easier

1. Generate Your Build Files

I wrote my own code generator that searches my entire dev directory for project files (I work with .NET, so it simply looks for *.csproj files) and generates a NAnt build file for each one. I also generate the master.build file. If there isn't already a CodeSmith template for this kind of thing, it's not too hard to create your own generator (code generator = fancy string builder). I'm also happy to share the generator if anyone is interested (I will put it up on CodePlex).

Generating your build files is also important because it guarantees that all of your projects are built in a consistent manner. Plus, when you make the build files cleverer, you can apply those changes across all of your build files at the push of a button (ok, a few buttons maybe).
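To make the idea concrete, here's a rough sketch of such a generator in Python (my real tool is C#; the NAnt template, file names and the `msbuild` task used here are simplified illustrations, not the actual output):

```python
import os

# Minimal sketch of a build-file generator: scan for *.csproj files and
# emit a bare-bones NAnt .build file per project, plus a master.build
# that invokes each one. The template is illustrative, not a complete
# NAnt configuration.
NANT_TEMPLATE = """<?xml version="1.0"?>
<project name="{name}" default="build">
  <target name="build">
    <msbuild project="{csproj}" />
  </target>
</project>
"""

def generate_build_files(root_dir):
    """Return {project_name: build_file_path} for every .csproj under root_dir."""
    generated = {}
    for dirpath, _dirs, files in os.walk(root_dir):
        for fname in files:
            if fname.endswith(".csproj"):
                name = os.path.splitext(fname)[0]
                csproj = os.path.join(dirpath, fname)
                build_path = os.path.join(dirpath, name + ".build")
                with open(build_path, "w") as f:
                    f.write(NANT_TEMPLATE.format(name=name, csproj=csproj))
                generated[name] = build_path
    # master.build simply calls each generated project build in turn
    master = os.path.join(root_dir, "master.build")
    with open(master, "w") as f:
        f.write('<?xml version="1.0"?>\n<project name="master" default="build">\n')
        f.write('  <target name="build">\n')
        for path in sorted(generated.values()):
            f.write('    <nant buildfile="%s" />\n' % path)
        f.write('  </target>\n</project>\n')
    return generated
```

The whole "generator" is little more than a directory walk plus string templates, which is exactly why it's worth owning one.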

2. Automate and Control Your Database Update Process

This is a whole topic in itself, but I'll try to keep it brief (of course, ignore this one if your app does not hit a database). The biggest challenge you will face is handling database updates effectively. Your build box, test box and of course production box all have their own database, and ideally each developer runs his/her own local copy (unless size does not permit). The problem is this: when a developer checks in a unit of work that includes a database change (a new table and some data, for example), how do you ensure that those changes are run against the build box's database when the build kicks off?

You must have an automated program that runs any new scripts, and this program must be one of the first things run as part of your build process. So what if one of the scripts fails? Ideally your program will run all scripts in a transaction and simply roll back the transaction if any of them fail. This works on SQL Server, but sadly on Oracle any DDL statement causes an implicit commit of the transaction, so you must find another approach (such as backing up first and restoring on failure).

Developers use the same program to update their local database, and the program is also used when promoting to other environments. By the time you run it against production, you will potentially have hundreds of scripts (which I have experienced), so make sure you have a good naming convention (or other technique) that ensures the scripts get run in the correct order.
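Here's a minimal sketch of such a runner in Python against SQLite, just to show the mechanics (my own tool is C# targeting SQL Server/Oracle; the tracking-table name, the one-statement-per-script simplification and the dict-based input are all illustrative):

```python
import sqlite3

def apply_new_scripts(conn, scripts):
    """Run any not-yet-applied scripts in name order, inside one transaction.

    `scripts` maps script name -> SQL. A sortable naming convention
    (e.g. '0001_create_orders.sql') controls the execution order.
    For simplicity each script here is a single SQL statement.
    Illustrative sketch only -- table and function names are made up.
    """
    conn.execute("CREATE TABLE IF NOT EXISTS applied_scripts (name TEXT PRIMARY KEY)")
    already = {row[0] for row in conn.execute("SELECT name FROM applied_scripts")}
    try:
        for name in sorted(scripts):           # naming convention => run order
            if name not in already:
                conn.execute(scripts[name])
                conn.execute("INSERT INTO applied_scripts (name) VALUES (?)", (name,))
        conn.commit()                          # all new scripts apply, or none do
    except sqlite3.Error:
        conn.rollback()                        # leave the database untouched
        raise
    return sorted(row[0] for row in conn.execute("SELECT name FROM applied_scripts"))
```

The `applied_scripts` table is what lets the same program run safely on a developer box, the build box and production: each database remembers which scripts it has already seen.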

All changes to the database must be made with a script, no exceptions. (Not exactly a new rule, but it amazes me how many people don't follow it!)

I wrote my own program to handle this and I'd be very interested to hear what others have done. I've been meaning to put the source (C#) up on CodePlex for ages, let me know if you are interested and I'll let you know when it goes up.

See the link below for the Automated Database Updater on CodePlex!

3. Get 'Buy In' Before Starting

From developers to the CIO. The CIO needs to understand the benefits that this new process will provide, because it is going to take time (and money) to get it set up (money and time that will DEFINITELY be recouped over time).

CI forces a more disciplined approach to software development. Some developers will not like that, so it's important that they understand the benefits they will see with the new process (like knowing that doing a 'get latest' will always return a working copy of the source!). Developers will get annoyed with the new process if they don't understand it properly and keep breaking the build.

4. Get A Decent Build Box

We are still working with a crappy old desktop. It takes longer to build, and we continually have to clean it up as it fills up pretty quickly. Frustrating!

5. Make Rules From Day 1

#1 - If you break the build, it becomes your highest priority to fix it.

#2 - If the build is broken, no further check-ins are allowed until it is fixed (except of course by the person fixing it).

#3 - No chicken runs... meaning no checking in just before you go home. (This is the only rule that is flexible.)

6. Clean Build

If you delete everything in your development directory on your build box (i.e. where all of the source code is) and then start the build... it should work! If it doesn't, then there is a dependency in there somewhere that is not in source control.

Summary

Every developer I have spoken to who has used CI tells me there is no way he/she could work in an environment that does not use it. It may present some challenges to set up, but it is well worth it!

If you are not doing CI... what are you waiting for?!

Points Of Interest

I should also mention that this is one of a few articles I'm working on relating to build processes and 'simplifying the boring stuff' so you can focus on what you really enjoy... writing cool code!

About the Author

I live in Sydney and have been a developer for almost a decade. I have a passion for technology and a strong interest in discovering 'better, cleaner, faster' ways to get systems out the door, because I believe software development takes too long. If I have an idea, I want to realise it as quickly as possible... plus, when writing systems for someone else, I want to deliver quickly so I can move on to the next interesting project!

Comments and Discussions

(1) A team of software developers are concurrently developing code and update a DB schema and data.

(2) When releasing a new software version - there is a need to support older databases that are already installed in customer locations.

In case (1) - the solution that worked best for me was to do an automatic diff/merge of the database files that were committed by developers to the central version control repository. I also integrated the diff/merge tool I wrote into the Subversion version control system, so my diff/merge operations were as smooth as working with simple source code files.

In case (2) - the solution that worked best was to have some kind of DB upgrade framework that performs most of the upgrade operation automatically, but also provides you with hooks to alter or augment its behavior if necessary (see my article on a DB upgrade framework for SQLite databases by searching Google for "SQLite upgrade").
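For SQLite specifically, a common approach is to store the schema version in `PRAGMA user_version` and walk a file forward one step at a time. A minimal sketch in Python (the upgrade steps, the `docs` table and the `before_step` hook are invented for illustration; the commenter's actual framework will differ):

```python
import sqlite3

# Each entry upgrades the schema by exactly one version; PRAGMA
# user_version records how far a given database file has got.
UPGRADES = {
    1: "CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT)",
    2: "ALTER TABLE docs ADD COLUMN title TEXT",
}

def upgrade(conn, target, before_step=None):
    """Bring conn's schema up to `target`, one version at a time."""
    current = conn.execute("PRAGMA user_version").fetchone()[0]
    while current < target:
        nxt = current + 1
        if before_step:                        # hook: alter/augment a step
            before_step(conn, nxt)
        conn.execute(UPGRADES[nxt])
        conn.execute("PRAGMA user_version = %d" % nxt)  # pragmas can't be parameterised
        conn.commit()                          # each step lands atomically
        current = nxt
    return current
```

Because every file carries its own version number, a 2.0 application can open a 1.0 file, see `user_version = 1`, and upgrade it on the spot.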

These are two different concepts in my opinion, and they require two different tools to execute effectively. Of course these two tools have a lot in common, but they differ in function: one is a diff/merge tool and the other is an upgrade framework (a library).

I think there is a lack of good diff/merge tools in the industry. In particular, I can't seem to find good diff/merge tools that are capable of performing 3-way diff/merge operations (like we have had with source code files for years).

I'm currently building my own tool to do this with SQLite databases. I don't like maintaining separate SQL script files for every change; that approach is cumbersome and error prone. Instead I opt for doing diff/merge operations as automatically as possible.

My tool parses the built-in SQLite schema syntax (based on the syntax diagrams in the SQLite documentation) and builds an internal representation of all schema objects. It then proceeds to compare and analyze all changes using a 3-way diff/merge algorithm.

This allows me to treat SQLite databases as if they are simple source code files.

An added benefit of 3-way diffing is that most changes (both schema AND data) can be merged automatically without a hitch (unlike standard 2-way diff, which fares much worse).
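The 3-way rule being relied on here is simple to state: given a common base version, a change made on only one side can be taken automatically, and only "both sides changed the same object differently" is a real conflict. A toy sketch in Python over name -> definition maps (nothing here is the commenter's actual tool):

```python
def three_way_merge(base, mine, theirs):
    """3-way merge of {object_name: definition} schema maps.

    A change on only one side wins automatically; identical changes on
    both sides agree; different changes to the same object conflict.
    Simplified sketch of what a 3-way schema diff/merge tool decides.
    """
    merged, conflicts = {}, []
    for name in set(base) | set(mine) | set(theirs):
        b, m, t = base.get(name), mine.get(name), theirs.get(name)
        if m == t:               # both sides agree (including both deleted)
            value = m
        elif m == b:             # only theirs changed -> take theirs
            value = t
        elif t == b:             # only mine changed -> take mine
            value = m
        else:                    # both changed differently -> conflict
            conflicts.append(name)
            continue
        if value is not None:    # None means the object was deleted
            merged[name] = value
    return merged, conflicts
```

A 2-way diff only sees that two databases differ; it is the base version that lets most changes resolve without human intervention.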

SQLite is a very different animal to a production SQL Server. If you're like us, when a new version of the app is released the SQLite DB just gets completely replaced... so there's no real need for scripted changes. Is this what your tool does, build a new DB from scratch?

One concern I have with using any kind of diff tool (e.g. Redgate's SQL Compare) is that some of the changes it finds in your 'dev' database may still be 'work in progress', and you (the person responsible for migrating those changes) have to manually review the differences to ensure that the 'work in progress' stuff does not get propagated.

Suppose you are using SQLite for storing some kind of structured document format in your application.
Now suppose that version 1.0 of your application was working with the original DB schema, and now you are rolling out version 2.0 that works with a more up-to-date DB schema.

How are you going to support backward compatibility for files that were created with version 1.0? You can't really ignore this problem, because your existing customers will be very angry with you.

Regarding your concern - you'd be surprised how much of the time changes can be merged automatically. I used to hold the same opinion about source files, until I moved to Subversion, where everything is merged automatically all the time.

For the few times that I really need to check the changes, I fire up my tool in GUI mode and review them manually (I don't do this often). Of course, handling conflicts requires working in GUI mode as well.

#3 - No chicken runs... meaning no checking in just before you go home. (This is the only rule that is flexible.)

At my last place of employment we had a rule among the developers: if you left for the day with the automated build broken and you were the one who broke it, you owed all the developers ice cream. (It was only about eight people, but you get the idea.) Therefore, if I ever got to the point of doing a check-in near the end of the day, I would always delay it until the next morning.

Great article, and I completely agree with what you say about how important it is and how difficult it can be to get buy-in.

Please share the tools you've made to create the NAnt and other supporting files, and give us your feedback on which tools you like to use.

I like CruiseControl and CCNetConfig (a CodePlex project) for configuring the server. But CCNetConfig (at last attempt) didn't support running on 64-bit, and we had issues with TFS client support on 64-bit as well, so we had to have the VM rebuilt as 32-bit.

Thanks for the article and bringing focus to this vital part of professional development. I'd be interested to hear what you think about setting up CI in environments that use multiple development branches.

The article doesn't provide any details at all to back up your claim. We keep all our DB objects in SQL and run them against the DB for every build. It isn't that hard, and it breaks the build when someone removes a column used in a test, etc. Works fantastically.

We may have gotten off on the wrong foot. I was pretty abrupt for which I apologize.

Are you just talking about validating the database in CI? With that I would agree. It's the rollout to production that is the problem.

My problem with rolling changes out to production in CI - which I have seen done - is that it perpetuates the notion that, once a database is "deployed", it is an operational concern. I think that in part II of that series of articles I mentioned, I make a pretty strong case for databases being versioned objects which are built in place, as opposed to servers which are maintained by operations.

No, rolling out the DB change would be silly without the accompanying code objects. Rolling out everything without some sort of gate on any kind of moderately complex project would also not be a good idea. We run all the SQL against a test database. That same SQL, generated during the build, is then run on the prod DB when it is time to deploy.

Yeah, see, we're talking about only slightly different processes. I still believe there are better processes than the "prototype" test database, but it's light-years ahead of what a lot of people are doing. You seriously should take a look at some of the stuff I've written. It goes one step further than using a test database: the focus is on creating a dependable, reusable builder which can be applied to static test databases (for exploratory testing), transient test databases (fixtures), and production.

If you need to get something up quickly (say, after hours, before everyone comes back and before a manager says "We can't afford that"), go grab TeamCity from JetBrains (the ReSharper guys). They have a "Professional" license that is VERY useful for small shops and proofs of concept.

I agree that CruiseControl is a good and open alternative that I did forget.

The important thing about this article is that it explains WHY you want to do this, and just how important it can be to have another set of "eyeballs" to ensure that your code, as stored in the repository, works. (Aside: you are using version control, aren't you? If not, CVS or Subversion are quick, easy and free to get started with.)

After you agree that it needs to be done, there are good tools like CruiseControl and even TeamCity that can get you going quickly and for free (even with TeamCity), without requiring management buy-in and support. Just a couple of hours (tops - because you'll be playing with it a bit, since it's cool) and you'll have a working system.

Thanks for reminding me of CC, I'm actually a bit embarrassed that I forgot them.

I would be interested in seeing what developers use for managing SQL change scripts.

I'm running into this problem at work. Without automation, developers are sometimes too lazy to keep changes up to date.

On a side note:

Auto-generating NAnt build scripts might work if all you need to do is build projects. But every build system I've run into has to do far more than just spawn a few builds.

Get 'Buy In':

You're right on this one. Certain developers are always going to scream bloody murder when they have to change their development practices. Nothing will get their buy-in prior to the changes. You'll just have to deal with their teething problems as they come up.