Big Visible TeamCity

Big Visible Cruise is a cool utility for adding a build information radiator to your team room. It makes your Cruise Control build status immediately clear with a large green (good) or red (bad) screen. My (admittedly limited) search for an equivalent solution for JetBrains’ TeamCity came up empty. TeamCity has a few built-in notification options, but none of them seemed to be the right fit. I toyed with many different possibilities, all of which would have required more development or infrastructure than we wanted to allocate to the task. Luckily, we found a solution that cost us little time and has a very small deployment footprint.

On individual build configuration pages, there is an option to “enable status widget”. This makes the build status publicly available on your TeamCity server at http://<buildserver>/externalStatus.html. This is what it looks like when we expose the status of our two builds:

Now, we can put that up on a big monitor for everyone to see, but I don’t think it “radiates” any information – you still need to purposefully read it. However, through the power of GreaseMonkey and jQuery, we were able to modify that page to look like this:

Now, it is immediately clear to anyone in the room when one of our builds has a problem. The script also refreshes the page automatically every 15 seconds, which the default externalStatus.html page does not do.

You can easily recreate this using Firefox, the GreaseMonkey Add-on, and my bigvisibleteamcity.user.js GreaseMonkey extension. If you aren’t familiar with GreaseMonkey, just install the add-on, then drag the *.user.js file onto your browser window, and you will be prompted to install it.

You will need to edit the bigvisibleteamcity.user.js file to change the URL in the @include line to refer to the externalStatus.html page on your server.
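To give a feel for what such a script involves, here is a minimal sketch. It is not the actual bigvisibleteamcity.user.js: it uses plain DOM calls rather than jQuery, and the `@name` value, the `.buildRow` selector, and the "fail"/"error" text matching are all assumptions for illustration, since the real externalStatus.html markup may differ.

```javascript
// ==UserScript==
// @name        Big Visible TeamCity (sketch)
// @include     http://<buildserver>/externalStatus.html
// ==/UserScript==

const REFRESH_MS = 15 * 1000; // reload the page every 15 seconds

// Assumption: a status text containing "fail" or "error" means a broken build.
function statusColor(statusText) {
  return /fail|error/i.test(statusText) ? "red" : "green";
}

// Assumption: each build is rendered in an element matching `selector`.
// ".buildRow" is a hypothetical class name, not TeamCity's real markup.
function paintBuilds(doc, selector = ".buildRow") {
  doc.querySelectorAll(selector).forEach((row) => {
    row.style.backgroundColor = statusColor(row.textContent);
    row.style.fontSize = "4em"; // big enough to read from across the room
  });
}

// Only touch the page when running inside a browser.
if (typeof document !== "undefined") {
  paintBuilds(document);
  setTimeout(() => location.reload(), REFRESH_MS);
}
```

Changing `REFRESH_MS` adjusts the refresh interval, and `paintBuilds` is the place to hang extra styling.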

I didn’t spend a lot of time on the styling, but the hooks are there if you want to improve it. If you are familiar with JavaScript, it should be clear from the source how to change the refresh time, add support for multiple rows of boxes, etc. Let me know if you make any cool changes.

Credit due to Dovetail cohorts Chad Myers for the idea and Sam Tyson for bootstrapping me on GreaseMonkey scripting.

You can go a lot further than this and customize the TeamCity externalStatus.jsp page as well. This allows you to do some nifty things like showing only failing builds, removing unnecessary info and adding your own styling. We made some modifications and then stuck some monitors up on the wall which show all the failing builds.

I’ve described how you do this and have example versions of the files you need to modify on my blog here:

@Scott – The new display is an attempt to raise awareness for our acceptance/regression test (“slow”) build, which, for various reasons, hasn’t received the same attention as our “fast” build. The build has been failing for an extended period of time, so the current indicator doesn’t raise the level of alarm that it should. Now that we’ve addressed most of the issues that were making it difficult to maintain a green build, we want to retrain ourselves to think of a failed regression build as something that requires immediate attention. The new display will help with that retraining. It will prove its usefulness (or lack thereof) within a few days.

We use BVC for our product owners to know when to push out a new version or not, and also when an existing push is ready. They love it. We bought a cheap-ass big screen TV, a 52″ for like 100 bucks. Frickin rad.

@Scott

The BVC breaks out status by project/build. When you have 8-9 builds, CCNET really doesn’t give a great status for just one indicator. It’s at the point now where I don’t even look at my CCNET for status, just for quick access to the project websites. Looking up at the TV is just way easier, it’s the first thing I look at when I come in the morning, before I even turn my computer on. Well, second, the first is the story wall to see if there are any red post-its for blocking issues.

The drivers for allowing failed builds to stay failed are usually organizational rather than technical.

Did you allow the failed build to remain failed because of the kind of visual indicator you’re using?

It sounds like the problems with the failing build were fixed before this indicator was put in place. Have you changed the human and organizational predispositions that permitted the build to remain broken?

The presence of indicators doesn’t create responses to indicators. Indicators create opportunities to make decisions as to what to do with the information broadcast by the indicators.

If you didn’t have that screen with project statuses, and your culture recognized failed builds as a problem, you would use another indicator.

I’m not saying that the more convenient indicator isn’t a plus. I’m saying that the indicator is less important than the culture and the values that ultimately drive the decisions that create outcomes.

A team that is predisposed to leave builds in a broken state will continue to leave those builds in a broken state if the team doesn’t value having those builds in a good state – regardless of how big the indicator is.

When there are a lot of indicators to consume, then we’ve gotta consider the format that the indicators are published in. The driver, the root cause, of the kind of interaction we have with indicators is the volume of information that needs to be consumed. Different interactions demand different formats, no doubt.

The fact that you don’t use the CCNet notification area widget isn’t a function of its usefulness, it’s a function of the presence of another indicator that has supplanted it – albeit a potentially more convenient indicator.

I’m not saying the screen with indicators isn’t useful – I’ve used them myself. I am saying that a lot of this often smells like geek spirit.

If maintaining clean builds is part of the team’s value system, and if you didn’t have the build status monitor on the wall, you’d still look at the other indicators you have.

It’s implied in control systems that the more glaring a visual control is, the more immediately and urgently a negative signal is dealt with. Visual controls that are communally accessible can also help in communal circumstances, like status meetings, etc.

I don’t disagree that visual controls are valuable, but I don’t believe that the situation that Josh’s team was in was caused by lack of information as much as lack of imperative.

Putting the failing builds on a monitor on the wall has had a massive effect. It is probably the single most effective change we have made (and one that took very little effort). Everyone who walks through the office can see the monitors, including managers who don’t have TeamCity on their taskbar. No one wants to be on there for long. It has also brought the developers together as a larger team, as all teams’ failing builds are on one monitor and it is everyone’s responsibility to try to keep it clear.

A little red ball on the task bar is easy to ignore. A big red bar on a monitor everyone can see is not.

@Scott – the issues are still technical, but we believe we are now at a point where it makes sense to spend time on it.
Specifically, our issues revolve around Selenium RC and the various waitFor* functions, which don’t appear to work as advertised. In our AJAX-heavy application, these functions are critical to getting reliable test results (as opposed to the false failures we currently get due to timing issues). If you have any insight into common issues with these APIs, it would be appreciated.

Perfectly stable browser automation under CI is a fool’s errand. Closing the last-mile gap is more expensive and less useful than integrating testers and testing better into the development process.

That said, I use explicit “done” markers in the HTML to prove that an AJAX request is complete. I haven’t found a more stable means.
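As a rough sketch of that idea: the marker id `ajax-done`, the `data-complete` attribute, and both function names below are hypothetical, and this is plain JavaScript polling rather than any Selenium RC API. The test polls for the marker and gives up after a timeout.

```javascript
// Hypothetical marker: the application is assumed to set
// <div id="ajax-done" data-complete="true"> when its AJAX callback finishes.
function isDoneMarkerPresent(doc) {
  return doc.querySelector('#ajax-done[data-complete="true"]') !== null;
}

// Poll `isDone` until it returns true, or reject once `timeoutMs` has elapsed.
function waitForDone(isDone, timeoutMs = 5000, intervalMs = 100) {
  return new Promise((resolve, reject) => {
    const deadline = Date.now() + timeoutMs;
    (function poll() {
      if (isDone()) {
        resolve();
      } else if (Date.now() >= deadline) {
        reject(new Error("timed out waiting for done marker"));
      } else {
        setTimeout(poll, intervalMs);
      }
    })();
  });
}
```

A test would call something like `waitForDone(() => isDoneMarkerPresent(document))` before asserting on the page. The application has to set the marker when a request completes and clear it when a new one starts, otherwise a stale marker would give a false "done".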

That’s wishful thinking. Given the right context, a broken build displayed in a grid on a TV screen is just as easily ignored, especially as the number of builds displayed grows. It’s the predisposition to ignore broken builds that makes this happen, not the predominance of indicators.

For example, first thing this morning many people noticed there were way too many broken builds on the board and there were various discussions about them. Everyone is now aware of the context behind these broken builds (even if they don’t affect them) and what needs to be done. These conversations simply would not have taken place if the only indicator was a little red blob on their taskbar.

It’s not just me that says it. Many people have noted the impact of the monitors and are really pleased with them. I can’t see how you can say this is wishful thinking!

I would encourage anyone reading this to ignore Scott and give it a go, as it has had a fantastic impact in my organisation. Some advice though: show only information that is relevant. Why do you need to know if a build is passing, for example? Use the monitors the way Toyota would use bells or lights when there is a problem on the production line.

I’m not doubting that making embarrassing indicators more public can contribute to the motivation for keeping builds clean, but if you have to rely on being shamed into keeping builds clean then there’s a deeper root cause to address that is still there, lurking, waiting to cause another problem, or presently causing other problems that should be solved but aren’t embarrassing enough to address.