It seems reasonable to me that if a serious bug is found in production by end-users, a failing unit test should be added to cover that bug, thus intentionally breaking the build until the bug is fixed. My rationale for this is that the build should have been failing all along, but wasn't due to inadequate automated test coverage.

Several of my colleagues have disagreed, saying that a failing unit test shouldn't be checked in. I agree with this viewpoint in terms of normal TDD practices, but I think production bugs should be handled differently - after all, why would you want to allow a build to succeed with known defects?

Does anyone else have proven strategies for handling this scenario? I understand intentionally breaking the build could be disruptive to other team members, but that entirely depends on how you're using branches.

+1; a very provocative question. I can see both sides.
–
Carl Manaster Jan 20 '12 at 11:46

You use the term "the build" to include "the tests", which is not a universal understanding.
–
Jay Bazuzi Jan 20 '12 at 21:19

If you're doing TDD, you would write the failing test, fix the code, and then check in. Thus you avoid a broken build.
–
dietbuddha Jan 22 '12 at 4:22

By the same logic you should shut down the clients' live instances until you fix the problem. No, you shouldn't break the build. Let the developer handling the bug add the unit test and the code changes together. No need to shut the whole process down.
–
Thanos Papathanasiou Jan 25 '12 at 13:42

12 Answers

Check in a failing test, but annotate it with @Ignore("fails because of Bug #1234").

That way, the test is there, but the build does not break.

Of course you note the ignored test in the bug DB, so the @Ignore is removed once the bug is fixed. This also serves as an easy check for the bug fix.
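As a sketch of this pattern (using Python's stdlib unittest as a stand-in for JUnit; the bug number and test name here are hypothetical):

```python
import unittest

class Bug1234RegressionTest(unittest.TestCase):
    # The skip decorator plays the role of JUnit's @Ignore: the
    # regression test is checked in and visible, but it is not run,
    # so it cannot break the build. Remove the decorator once the
    # bug is fixed, turning this into a normal regression test.
    @unittest.skip("fails because of Bug #1234")
    def test_bug_1234_is_fixed(self):
        self.fail("Bug #1234: reproduce the defect here")

if __name__ == "__main__":
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(Bug1234RegressionTest)
    unittest.TextTestRunner(verbosity=2).run(suite)
```

A test runner then reports the test as skipped rather than failed, and the skip reason points straight back to the bug tracker entry.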

The point of breaking the build on failing tests is not to somehow put the team under pressure - it's to alert them to a problem. Once the problem is identified and filed in the bug DB, there's no point in having the test run for every build - you know that it will fail.

Of course, the bug should still be fixed. But scheduling the fix is a business decision, and thus not really the dev's concern... To me, once a bug is filed in the bug DB, it's no longer my problem, until the customer/product owner tells me they want it fixed.

+1 I think you hit the nail on the head with "scheduling the fix is a business decision" - as a developer it's not my judgement call to decide if a bug breaks the build.
–
MattDavey Jan 20 '12 at 11:43

I think this is a very sensible solution. Especially if the next guy to check in some minor code suddenly gets a "failed test" notice and thinks it's something he has done.
–
Holger Jan 20 '12 at 13:08

"To me, once a bug is filed in the bug DB, it's no longer my problem"... +1
–
jberger Jan 20 '12 at 15:38

@anon Except at Toyota. A line worker sees a defect, pulls the andon cord, and the entire plant comes to a halt; management comes over, and the line isn't restarted until the problem is fixed. Google "Andon Cord" - it's not a new concept. See: startuplessonslearned.com/2008/09/…
–
Christopher Mahan Jan 21 '12 at 5:10

@AndresJaanTack: This will of course depend on your methodology, but in general I'd disagree. At least in a business setting, scheduling work is a business decision - and that includes bug fixing. Sometimes, a new feature (or releasing on a certain date) may be more valuable than fixing a (minor) bug - in that case the bug fix must be deferred. "Fixing the bug now" would be inappropriate in that situation, because it delays more important work.
–
sleske Jan 21 '12 at 20:52

Because sometimes, you have time constraints. Or the bug is so minor that it isn't really worth delaying the shipment of the product by the few days needed to unit test and fix it.

Also, what's the point in breaking the build intentionally every time you find a bug? If you found it, fix it (or assign it to the person who will fix it), without disturbing your whole team. If you want to remember to fix a bug, then you need to mark it as very important in your bug tracking system.

I understand - I think the point is summed up in your first paragraph: it's not up to the developer to judge the seriousness of the bug, or whether it's a show-stopper; that's a decision for the wider business.
–
MattDavey Jan 20 '12 at 11:40

The question is: what is the priority of this bug? It could be "OMG, fix it now!", it could be "yeah, that's annoying, we should fix it at some point", or it could be something in the middle. Not all bugs will hit the same place on that spectrum.
–
Zachary K Jan 21 '12 at 16:59

Tests are there to ensure that you don't (re-)introduce problems. The list of failing tests isn't a substitute for a bug tracking system. There is some validity in the POV that failing tests aren't only an indication of bugs; they are also an indication of development process failure (from carelessness to a badly identified dependency).

@yarek: The regression tests can go into the codebase at any time, they just need to be ignored until the bug is fixed. I usually develop them while diagnosing the problem, because they can then double as a debugging aid.
–
sleske Jan 20 '12 at 12:42

This is a good example of why "breaking the build" becomes a toxic part of a workplace where CI devolves into merely "Blame Driven Development". I have sat in lots of meetings where the PHB starts whinging about "carelessness" as if that's why the build broke. In such an environment, you would hardly intentionally check in something that broke the build. Otherwise the PHB will get upset. Break the build, wear the cone of shame. What a crappy practice.
–
Warren P Feb 9 '12 at 17:16

@WarrenP, there are sometimes other issues, but let's be clear: carelessness is the first reason why builds break. If you know that something breaks the build, why check it in?
–
AProgrammer Feb 9 '12 at 18:02

"Break the build" means to prevent a build from completing successfully. A failing test doesn't do that. It is an indication that the build has known defects, which is exactly correct.

Most build servers track the status of tests over time, and will assign a different classification to a test that's been failing since it was added vs a regression (test that used to pass and no longer does), and also detect the revision in which the regression took place.
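Some test frameworks can encode this "known defect" status directly, so such a test keeps running on every build without marking the build broken. A minimal sketch using Python's stdlib unittest (buggy_add is a made-up stand-in for the defective production code):

```python
import unittest

def buggy_add(a, b):
    return a + b + 1  # hypothetical known defect (Bug #1234)

class KnownDefectTest(unittest.TestCase):
    # expectedFailure documents a known defect: the test runs on every
    # build, and its (expected) failure does not break the build. Once
    # the bug is fixed, the runner reports an "unexpected success",
    # signalling that the marker can be removed.
    @unittest.expectedFailure
    def test_add_is_correct(self):
        self.assertEqual(buggy_add(2, 2), 4)

if __name__ == "__main__":
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(KnownDefectTest)
    unittest.TextTestRunner(verbosity=2).run(suite)
```

This gives exactly the three-way classification discussed here: passing, failing (regression), and failing-as-expected (known defect).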

This isn't always correct - often teams consider a failing test a broken build; in fact most of the teams I've seen lately do (it's a typical agile practice). With most agile teams a failing test is a case where you stop the line - the entire team attacks the problem and solves it. I suppose I could take your post to mean that the response has to be based on your practices, in which case it's totally accurate.
–
Bill K Jan 21 '12 at 0:03

I always consider a failing test to mean the build has failed.
–
John Saunders Jan 25 '12 at 19:01

@JohnSaunders: But that isn't what it means. As I said in my answer, it means "the build has known defects".
–
Ben Voigt Jan 25 '12 at 19:50

I suppose there are different definitions. The one I use is "the build fails". Perhaps this is because I use TFS, which includes defect tracking, that I don't need a definition that means "there are known defects". "Known defects" are expected, until they are prioritized to be fixed.
–
John Saunders Jan 25 '12 at 19:53

I have little problem with the immediate creation of a failing test. I just don't want it checked in to the set of tests that will run upon build. I also want the developer who fixes the bug to be able to ignore that test. In most places I've worked, the people who create the bug report will not be creating unit tests.
–
John Saunders Jan 25 '12 at 22:04

I would argue that the failing test should be added, but not explicitly as "a failing test."

As @BenVoigt points out in his answer, a failing test doesn't necessarily "break the build." I guess the terminology can vary from team to team, but the code still compiles and the product can still ship with a failing test.

What you should ask yourself in this situation is,

What are the tests meant to accomplish?

If the tests are there just to make everyone feel good about the code, then adding a failing test just to make everyone feel bad about the code doesn't seem productive. But then, how productive are the tests in the first place?

My assertion is that the tests should be a reflection of the business requirements. So, if a "bug" has been found that indicates a requirement is not properly met, then it is also an indication that the tests do not properly or fully reflect the business requirements.

That is the bug to be fixed first. You're not "adding a failing test." You're correcting the tests to be a more accurate reflection of the business requirements. If the code then fails to pass those tests, that's a good thing. It means the tests are doing their job.

The priority of fixing the code is to be determined by the business. But until the tests are fixed, can that priority truly be determined? The business should be armed with the knowledge of exactly what is failing, how it is failing, and why it is failing in order to make their decisions on priority. The tests should indicate this.

Having tests which don't fully pass isn't a bad thing. It creates a great artifact of known issues to be prioritized and handled accordingly. Having tests which don't fully test, however, is a problem. It calls into question the value of the tests themselves.

To say it another way... The build is already broken. All you're deciding is whether or not to call attention to that fact.

@JohnBuchanan: What are the tests meant to validate, if not that the software is doing what it is supposed to do? (That is, that it meets the requirements.) There are, as you state, other forms of tests outside of unit tests. But I fail to see the value in unit tests which don't validate that that software meets the needs of the business.
–
David Jan 20 '12 at 18:17

@JohnBuchanan - he didn't say "the tests ARE a reflection of the business requirements", he said "SHOULD BE". Which is true, but debatable. You're right in claiming that this isn't always the case - some people write unit tests that don't map to business requirements - although (in my opinion) they're wrong to do so. If you want to claim that David's assertion is wrong, you could say something about why you think so.
–
David Wallace Jan 21 '12 at 0:29

In our test automation team, we check in failing tests as long as they fail due to a defect in the product rather than a defect in the test. That way we have proof for the dev team that hey, they broke it. Breaking the build is highly frowned upon, but that's not the same as checking in perfectly compilable but failing tests.

I think @MattDavey's idea is an excellent one, and I have argued for it in the past. I have always met a stone wall of resistance - "you should never break the build!". The idea that the build is already broken in this situation seems impossible for people to grasp. Sadly, this is just another case where a good idea (automated testing and clean builds) has devolved into a cargo-cult practice whose adherents know the rule but not the reason.
–
Tom Anderson Jan 21 '12 at 10:07

One idea I came up with is that the QA team (if they're technical enough to write tests) should be allowed to write failing tests for bugs and check them in. The developers' obsession with the green bar would then lead to absolutely prioritising fixing bugs over adding features, which is the correct way to do development.
–
Tom Anderson Jan 21 '12 at 10:09

But these should not be unit tests that will run during the build. If your environment contains a test management system for QA (like Microsoft Test Manager), then certainly, one or more test cases should be added and linked to the bug, but this would not prevent the build from succeeding - it would simply be a test case that has to pass before the bug is considered "Done".
–
John Saunders Jan 25 '12 at 19:03

Writing a test that you know will fail until the bug is fixed is a good idea; it's the basis of TDD.

Breaking the build is a bad idea. Why? Because it means that nothing can move on until it is fixed. It essentially blocks all other aspects of development.

Example 1
What if your application is very large, with many components? What if those components are worked on by other teams with their own release cycles? Tough! They have to wait for your bug fix!

Example 2
What if the first bug is hard to fix and you find another bug with a higher priority? Do you also break the build for the second bug? Now you can't build until both are fixed. You have created an artificial dependency.

There is no logical reason why failing a test should stop a build. This is a decision that the dev team needs to make (perhaps with management discussion) weighing up the pros and cons of releasing a build with known bugs. This is very common in software development, pretty much all major software is released with at least some known issues.

It depends on the role the tests are supposed to play and how their results are supposed to affect the build system/process adopted. My understanding of breaking the build is the same as Ben's, and at the same time we should not knowingly check in code that breaks existing tests. If those tests came in "later", it might be "okay" to ignore them just to make it less unnecessarily disruptive to others, but I find this practice of ignoring failing tests (so that they appear to pass) rather disturbing (especially so for unit tests), unless there is a way to indicate such tests as neither red nor green.

"unless there is a way to indicate such tests as neither red nor green" Just for the record: Most unit testing frameworks do distinguish red, green and ignored tests. At least JUnit and TestNG do (they report "xx test, x failed, x ignored").
–
sleske Jan 21 '12 at 16:52

@sleske that would be ideal; I just want to be sure that ignoring failures does not automatically turn into a success
–
prusswan Jan 21 '12 at 16:57

Aren't there YELLOW builds (Red/Green/Yellow) in Hudson/Jenkins, CruiseControl, and all the other biggies?
–
Warren P Feb 9 '12 at 17:18

@Warren P it works when people ignore tests correctly, but some ignore tests by making them green
–
prusswan Feb 9 '12 at 23:15

It depends on the bug, of course, but generally if something went wrong that wasn't identified during manual or automated testing, then that implies there is a gap in your coverage. I would definitely encourage figuring out the root cause and slapping a unit test case on top of the problem.

If it's a production issue that is planned for a hot fix from a maintenance branch, you also need to ensure that the fix works on the mainline, and that a developer can't erroneously blow away the fix with an over-zealous merge conflict resolution.

Additionally, depending on your release policy, the presence of newly updated unit tests can help confirm that a developer has actually fixed the problem, rather than merely moved it around, although this depends on having encoded the correct requirements in the new unit tests.

One problem with adding a known-to-fail test to the build is that your team may get into the habit of ignoring failing tests because they expect the build to fail. It depends on your build system, but if a failing test doesn't always mean "something has just broken", then it's easy to pay less attention to failing tests.

Another option is to check the failing test into a separate branch in your source control system. Depending on your practices, this may be feasible or not. We sometimes open a new branch for ongoing work, such as fixing a bug that is not trivial.

While I think you should in some way check in the bug as a test, so that once you fix it, it does not happen again, and in some way prioritize it, I think it's best not to break the build (or the tests). The reason is that subsequent build-breaking commits will be hidden behind your broken test. By checking in a broken test for this bug, you demand that the whole team set their work aside to fix it. If that does not happen, you might end up with breaking commits which are not traceable as such.

Therefore I would say it's better to commit it as a pending test, and make it a priority in your team not to have pending tests.