Category Archives: CI

When you start work on a code base, either from scratch or as you approach it as a new dev, there are a few things you should ensure are in place before you get going. Like most other things, these are easier and more natural on more mature platforms than it tends to be on .NET where the toy app tends to be king, but it is doable. I will here list some technologies I use and I may mention some competing ones for completeness. I can’t share most of the code because reasons, but if there is interest, I can put together samples showing specific techniques of the ones I have used.

Introduction

The checklist of what you need to put in place if it doesn’t exist is the following:

Version control

Build server

Automated tests run as part of the build process

Automated packaging

Automated deployment

Automated monitoring

Automated setup of development environment

I know most of that reads as “a house must have a roof” but there were, and still are, places where people go “We’re only a couple of guys, we can just leave the source on the NAS, it’s backed up”, so I’m just being clear here.

The last point may seem like overkill, but especially if you use the abomination that is 3rd party components, it is crucial to save time and more easily onboard new guys.

Version control

I’m going to shock you here and not require that you use Git. You might want to, because it’s becoming a skill everybody has, and GitHub is a beautiful place to keep your code, but if your workflow is centralised anyway. you can legitimately stay with Subversion as long as you take care of the basics, as in backing up the repository properly. Recent versions have less horrible diffing, so it is is not that bad anymore. The most crucial aspect is that you do need a version control system that has a good powerful command line interface.

Build server

I’m going to suggest TeamCity here, because I know it well, but it has some real drawbacks. GitHub comes with TravisCI which is a modern CI system. Jenkins has been popular. The point is – make sure your build configuration is the same on the development machine as it is on the build server, and make sure the build configuration is version controlled with the source, ideally that you can run all of it from the command line. This way you can know you are not about to break the build before you push your changes and also you know that you can recreate a previous version of the code without faff because a contemporary build configuration can be found in source control along side the source.

Automated tests run as part of the build process

The obvious bit here are to run your favourite command-line testrunner on your build output to make sure all your tests run on the build server.

You can always write powershell scripts to make simple but high-value integration tests, it doesn’t all have to be Selenium / FitNesse even though those are quite nice once you are over the initial hurdle, but yes, there is a cost.

Automated deployment

Many ways to deploy exist, OctopusDeploy is popular to use with TeamCity, but you should look at chef and puppet as well. They tend to be hostile to Windows, but it is now possible to use them, and they make sense. It is as if they are specifically designed for the purpose of deploying and maintaining software infrastructure.

Years ago I looked very briefly at Puppet, but I have come to work with chef recently and it does what it is supposed to do, but I have no factual reason to rate chef higher than puppet other than that I now know it.

For chef you can look at kitchen to test your deployment scripts in transient VMs. It can use Azure VMs, VMWare WorkStation or Vagrant / VirtualBox and it does shorten the feedback cycle considerably.

Automated monitoring

You can use pingdom, Monitis or Nagios or just run a few curl/wget in a scheduled task and send an email if they don’t return the expected information. Either way, you need to be able to know if things aren’t working. Use the smallest possible thing you can get away with if your budget is constrained, but do use something.

Automated setup of the development environment

This may be sensitive, as developers tend to be particular about where their code lives and how their machine is organised, but having the developers just be able to run a script and end up with all their development tools set up and ready to debug.

Some things are going to be difficult. Installing Redgate SQL Source Control doesn’t seem to be scriptable, but other than that you can:

install Visual Studio

Install plugins

Use chocolatey to install source control management

Get the sources locally

In some cases, as your system grows, and you break bits out into separate micro services, you will need this scripting to make sure you can set up the entire system to debugged locally. Ideally you would, as an additional means to verify your deployment method, use chef-zero or corresponding technology from Puppet to use your normal deployment templates as you configure the system on the development machines as well. This is another situation where clever IDE tooling actively makes things difficult for you, but any work you put in to automate here will pay huge dividends.

But what does this all mean in practice?

In short, if you are going to keep working on Windows exclusively, at least learn Powershell. It is not that horrible. The stance among Windows users have long been anti-scripting, and with the state of CMD.exe it has been for good reason. Powershell, though, has a lot of features that make sense even though the syntax can be confusing at first. Learning ruby may be more viable from a cross-platform perspective, but Powershell is evidently more suited for Windows and a lot of useful cmdlets for enabling windows features et cetera are only available in Powershell.

Me and a colleague have been struggling for some time deploying an IIS website correctly using chef. As you may be aware, chef is used to describe the desired state of the configuration of a system through the use of resources, which know how to bring parts of the system into the desired state – if they, for some reason, should not be – during a process called convergence.

Chef has a daemon (or Service, as it is called in the civilised world) that continually ensures that the system is configured in accordance with the desired state. If the desired state changes, the system is brought into line automagically.

As usual, what works nicely and neatly in Unix-like operating systems requires volumes of eloquent code literature, or pulp fiction rather, to implement on Windows, because things are different here.

IIS websites are configured with a file called web.config. When this file changes, the website restarts (the application threadpool does, to be specific). Since the Chef windows service is running chef-client in regular intervals it is imperative that chef-client doesn’t falsely assume that the configuration needs to be overwritten every time it runs is as that would be quite disruptive to any would-be users of the application. Now, the autostart behaviour can be disabled, but that is not the way things should have to be.

A common approach on Windows is to disable the chef service and to just run the chef client manually when you know you want to deploy things, but that just isn’t right either and it takes a lot away from the basic features and the profound magic of chef. Anyway, this means we can’t keep tricking chef into believing that things have changed when they really haven’t, because that is disruptive and bad.

So like I mentioned earlier – IIS websites are configured with a file called web.config. Since everybody that ever encountered an IIS website is aware of that, there is no chance that an evildoer won’t know to look for the connection strings to the database in that very file. To mitigate this well-knownness, or at least make the evil-doer first leverage a privilege escalation vulnerability, there is a built-in feature that allows an administrator to encrypt the file so that the lowly peon account that the website is executing as doesn’t have the right to read it. For obvious reasons this encryption is tied to the local machine, so you can’t just copy the file to a different machine where you happen to be admin to decrypt it. This does however mean that you have to first template the file to a temporary location and then check if the output of the chef template, the latest and greatest of website configuration, is actually any different from what was there before.

It took us ages to figure out that what we need to do is to write our web.config template exactly as it looks once it has been decrypted by Windows, then start our proceedings by decrypting the production web.config into a temporary location. We then set up the chef template resource to try and overwrite the temporary file with new values, and if there has been a change, use notifications to trigger a ruby_block that normally doesn’t execute, but when triggered by the template resource both encrypts the updated config and copies it across to prod.

Result!

But wait… The temporary file has to be deleted. It has highly sensitive information (I would like to flatter myself) and shouldn’t even have made it to disk in its clear-text form, and now it’s still there waiting to be read by the evildoer.

Using a ruby block resource or a file resource to delete the temporary file causes chef to record this as a change, and change that isn’t a change is bad. Or at least misleading in this case.

Enter colleague nr 2 “just make a bad resource that doesn’t use converge_by”.

Of course! We write a resource that takes a path and deletes it using pure ruby code, but it “forgets” to tell chef that a file was deleted, so chef will update the configuration when it should but will gladly report 0 resources updated at the end of a run where nothing has changed. Beautiful!

This is not a deeply technical post. Even less technical than my other ones. I write this inspired by a conversation on Twitter where I got asked about a few fairly advanced pieces of technical software by a clever but not-that-deeply technical person. It turned out he needed access to a website to do his human messaging thing and what he was given was some technical mumbo jumbo about git and gerrit. Questions arose. Of course. I tried to answer some of the questions on Twitter but the 140 char limit was only one of the impediments I faced in trying to provide good advice. So, I’m writing this so I can link to it in the future if I need to answer similar questions.

I myself am an ageing programmer, so I know a bunch of stuff. Some that is true, some that was true and some that ought to be true. I am trying to not lie but there may be inaccuracies because I am lazy and because reality changes while blog posts remain the same because, again – lazy.

What is Git?

Short answer: A distributed version control system.

Longer answer: When you write code you fairly quickly get fed up with keeping track of files with code in them. The whole cool_code.latest24.txt naming scheme to keep old stuff around while you make changes quickly goes out the window and you realise there should be a better way. There is. Using a version control system, another piece of software keeps track of all the changes you have made to the bunch of files that make up your website/app/program/set of programs, and let you compare versions, find the change that introduced a nasty bug et c. With Git you have all that information inside your local computer and can work very quickly and if you need to store your work somewhere else like on a server or just with a friend that has a copy of the same bunch of files (repository), only the changes you have made get sent across to that remote repository.

There are competing systems to git that do almost the same thing (mercurial is also distributed) and then a bunch of systems that instead are centralised and only let you see one version at the time, and all the comparison/search/etc stuff happen across the network. Subversion (SVN) or CVS are common ones. You also have TFS, Perforce and a few others.

What is a build server?

Short answer: a piece of software that watches over a repository and tries to build the source code and run automated tests when source code is modified.

Longer answer:

The most popular way for a developer to dismiss a user looking for help is to say “That works on my machine”. Legitimately, it is not always evident that you have forgotten to add a code file to your version control system. This causes problems for other people trying to use your changes as the software is incomplete and probably not working as a result of the missing files. Having an automated way of checking that the repository is always in a state where you can get the latest version of the code in development, build it and expect it to work is very helpful. It may seem obvious to you, but it was not a given until fairly recently. Any decent group of developers will make sure that the build server at least has a way of deploying the product, but ideally deployment should happen automatically. This does require that the developers are diligent with their automatic tests and that the deployment automation is well designed so that rare problems can be quickly rolled back.

What is Gerrit?

Short answer: I’ll let Wikipedia answer this one as I have managed to avoid using it myself despite a few close calls. It seems to be a code review and issue tracking tool that can be used to manage and approve changes to software and integrate with the build server.

Long answer:

Before accepting changes to source code, maintainers of large systems will look at submitted changes first and see if it is correct and in keeping with the programming style that the rest of the system uses. Despite code being written “for the computer”, maybe the most important aspect of source code quality and style is that other contributors understand the code and can find their way. To keep track of feature suggestions, bug reports and the submitted bits of code that pertain to them, software systems exist that can help out here too.

I don’t know, but I would think I could get most of that functionality with a paid GitHub subscription. Yes, gerrit is free, but unless you are technical and want to pet a server you will have to hire somebody to set up and maintain the (virtual) machine where your are running gerrit.

Where do I put my website?

Short answer: I don’t know, the possibilities are endless today. Don’t pay a lot of money.

Longer answer: Amazon decided, a long time ago, to sell a lot of books. They figured that their IT infrastructure had to be easy to extend as they were counting on having to buy a ton of servers every month so their IT stuff had to just work and automatically just use the new computers without a bunch of installing, configuration and the like. They also didn’t like it that the most heavy duty servers cost an arm and a leg, and wanted to combine a stack of cheaper computers and pool those resources to get the same horsepower at a lower cost. They built a bunch of very clever programs that provided a way to create virtual servers on demand that in actual fact were running on these cheap “commodity hardware” machines and they created a way to buy disk space as if it was electricity, you paid per stored megabyte rather than having to buy and entire hard drive et c. This turned out to be very popular and they started selling their surplus capacity to end users, and all of a sudden you could rent computing capacity rather than making huge capital investments up front. The cloud era was here.

Sadly, there was a bunch of companies out there that already owned a bunch of servers, the big iron expensive ones even, and sold that computing capacity to end users, your neighbourhood ISP, Internet Service Provider, or Web Hotel.

Predictably, the ISPs aren’t doing too well. They niche themselves in providing “control panels” that make server administration slightly less complicated, but the actual web hosting aspect is near free with cloud providers, so for almost all your website needs, look no further than Amazon or Microsoft Azure, to name just two. Many of these providers can hook into your source control system and request a fresh copy of your website as soon as your build server says the tests have run and everything is OK. Kids today take that stuff for granted, but I still think it’s magical.

Don’t forget this important caveat: Your friendly neighbourhood ISP may occasionally answer the phone. If you want to talk to a human, that is a lot easier than to try and get a MS or Amazon person on the phone that can actually help you.

Automatic deployment when you commit and push changes to your Github/Bitbucket repo is truly awesome and like running water or electricity that it is so convenient that you incorporate it into the fabric of your life and eventually only really notice it when it’s broken.The amount of reduced headache and increased productivity is amazing.

Overview

However, bringing continuous deployment to legacy web projects and solutions can be a bit challenging until it all works. The moving parts involved favor convention over configuration so in the most cases you don’t need to really do anything, but then there are cases when you all of a sudden notice that you need to do something to make things roll smoothly again.

Windows Azure Websites use a few tricks for the Deploy from Source Repository-feature to work. They put a git repository in your allotted folder on the IIS where your website resides which you can easily see via FTP. You can access this repo directly if you do a straight git deployment where an “azure” remote is added to your local git repo, otherwise it is used by WAWS to pull your changes from TFS, Codeplex, GitHub or Bitbucket and it is from there the deployment happens. Allegedly, even when you deploy from Dropbox, your changes are first applied to the git repo in your directory in Azure and then the deployment begins from there.

The deployment system is open sourced at https://github.com/projectkudu/and you find further documentation there.

Gotchas

Multiple deployable projects in one repo

Let’s say you have a solution to deploy that has multiple projects that are potentially deployable. Kudu will not be able to guess which one to deploy and the deployment will fail with an error. What to do?

The solution is a deployment configuration file, a .deployments file. Yes, Windows won’t let you right click and create a file called “.deployment”, nor rename an existing file to that name, so you need to save it directly to that name from Visual Studio or just type the following in cmd.exe: (yes, the command means to copy from the Console, as in what you type, so just type the config file and finish with Ctrl+Z which means End of File and press enter again).

After you add, commit and push that file, Azure will know which project to deploy. Further documentation on the .deployment file is available here.

Compilation fails with missing references

If you are using Nuget to manage your dependencies, you can choose either of two routes, Nuget Package Restore where Kudu will download the relevant libraries for you, or just committing your entire .nuget folder to your Git repository so that all your reference libraries are available for compilation on Azure.

The first option makes your repository more light-weight, but on the other hand you would be hoping that whomever manages the nuget packages on which you depend aren’t cheeky, like Microsoft have been in the past and withdraw the packages from under you, or that the maintainer misses a breaking change thus violating SemVer versioning causing Kudu to supply your app with an incompatible library.

With the second option you get a bulkier Repo, but you are in control of new versions of dependent libraries so that you can find out that they truly work with your system before you deploy. It works almost the same if you manually maintain your dependencies in the form of a Lib folder containing the DLLs that you need. Just ensure your Libs folder has been comitted to your Git repo and Kudu will be able to find them.