My Build Is Bleeding

Ivan Moore has a possibly clever solution to one of our worst current problems: about half of the 18 servers that form our continuous integration service are running functional tests, and those tests are slow and getting slower. So we get exactly what Ivan describes: many people check in on what looks like a green build, and by the time it gets around to failing, we’ve no idea which check-in actually broke it.

Is the answer lots of builds, cleverly organised as Ivan suggests to pinpoint “bad” commits? What about interactions between commits – A’s and B’s changes are OK individually, but clash when combined? To keep it all running, would I need hundreds of servers running on tens of virtualisation hosts and a new server room to put them in?
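
To make the trade-off concrete, here is a minimal sketch of one way to pinpoint a bad commit. It is a plain bisection over the check-ins since the last green build, not necessarily the scheme Ivan proposes. The names `first_breaking_commit` and `passes` are hypothetical, and it assumes two things: the build was green before the first commit in the list, and once the build breaks it stays broken as later commits are added. Because each trial build contains a whole prefix of the commits rather than one commit in isolation, a clash between A and B shows up at whichever of the two landed second.

```python
from typing import Callable, Optional, Sequence


def first_breaking_commit(
    commits: Sequence[str],
    passes: Callable[[Sequence[str]], bool],
) -> Optional[str]:
    """Return the commit whose addition first turns the build red, or None.

    `commits` are the check-ins since the last known green build, oldest first.
    `passes` is a hypothetical hook that builds and tests the baseline plus the
    given commits and reports whether the result is green.
    """
    if passes(commits):
        return None  # everything is still green, nothing to find

    # Invariant: the prefix of length `lo` passes, the prefix of length `hi` fails.
    # Assumes the empty prefix (the last green build) passes.
    lo, hi = 0, len(commits)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if passes(commits[:mid]):
            lo = mid
        else:
            hi = mid
    return commits[hi - 1]


# Example: A and B are each fine alone but clash when both are present.
def clash(applied: Sequence[str]) -> bool:
    return not {"A", "B"} <= set(applied)


culprit = first_breaking_commit(["A", "C", "B", "D"], clash)
```

Under those assumptions the number of extra builds grows with the logarithm of the number of commits rather than with the number itself, which suggests the answer need not be hundreds of servers. Each of those builds still takes as long as the slow functional tests do, though, so this narrows down the culprit without making the feedback any faster.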