Do other people fix bugs when they see them, or do they wait until there's crashes/data loss/people die before fixing it?

Example 1

Customer customer = null;
...
customer.Save();

The code is clearly wrong, and there's no way around it - it's calling a method on a null reference. It happens to not crash because Save happens to not access any instance data; so it's just like calling a static function. But any small change anywhere can suddenly cause broken code that doesn't crash: to start crashing.

Calling it is a mistake, since it bypasses all the subsequent constructors.

I could change the object's lineage to not expose such a dangerous constructor, but now I have to change the code to:

Customer customer = new Customer(depends);

But I can't guarantee that this change won't break anything. Like my Example 1 above, perhaps someone, somewhere, somehow, under some esoteric conditions, depends on the constructed Customer to be invalid and full of junk.

Perhaps the Customer object, now that it is properly constructed will allow some code to run that previously never did, and now I can get a crash.

i can't bet your wife's life on it.

And I can test it from here to Tuesday, I can't swear on your daughter's life that I didn't introduce a regression.

If you're unwilling to change anything because it might break something, and you consider yourself unqualified to look at what might happen in other parts of the code, what are you doing here? Are you paralyzed at the keyboard because the line of code you're about to write might be wrong?
–
David ThornleyOct 28 '10 at 19:13

"Do Something" is often a call to other functions you wrote or functions from libraries, in either case the code should have been unit tested. The procedure itself is also unit tested leaving the chances for new bugs to be introduced when you fix one very low... If I build a bridge, I would test it first in a smaller scale rather than letting people that walk over the bridge die. If you aren't doing unit tests when worried about bugs, you're doing it wrong. In whatever case you fix it you are blamed, so instead of fixing it you could prevent it and not be blamed for at all.
–
Tom WijsmanOct 29 '10 at 1:16

2

'do [you] wait until people die before fixing it'! Holy cow, I seriously hope there is only one answer to this...
–
Chris KnightOct 29 '10 at 9:19

3

As a comment specifically about one thing you said: you don't know if somewhere else in the code, they rely on the esoteric behavior: so? If someone is abusing a clearly incorrect scoping rule as a hack in their code, then they're code is WRONG. Good OOP would have prevented that bug, but you not fixing because they used a bad practice is compounding the problem, leaving it open for further abuse, and making the system more unstable all the way around. Fix the bug, hope testing catches any problems, fix more bugs if it doesn't. The long term stability of the product is a vital metric.
–
CodexArcanumNov 1 '10 at 15:21

13 Answers
13

This depends wildly on the situation, the bug, the customer, and the company. There is always a trade-off to consider between correcting the implementation and potentially introducing new bugs.

If I were to give a general guideline to determining what to do, I think it'd go something like this:

Log the defect in tracking system of choice. Discuss with management/coworkers if needed.

If it's a defect with potentially dire consequences (e.g. your example #2), run, scream, jump up and down till someone with authority notices and determine an appropriate course of action that will mitigate the risks associated with the bug fix. This may push your release date back, save lives, wash your windows, etc.

If it's a non-breaking defect, or a workaround exists, evaluate whether the risk of fixing it outweighs the benefit of the fix. In some situations it'll be better to wait for the customer to bring it up, since then you know you aren't spending time fixing/retesting things when it's not 100% required.

Mind you, this only applies when you're close to a release. If you're in full development mode, I'd just log the defect so it can be tracked, fix it, and call it done. If it's something that takes more than, say, half an hour to fix and verify, I'd go to the manager/team lead and see whether or not the defect should be fit into the current release cycle or scheduled for a later time.

Very true. That's why (some) companies have technical managers, bug crawls, and so on: to prioritize things. I'm not saying don't take initiative - not at all - but you have to use good judgment. Taking the wrong initiative at the wrong time is called "being a loose cannon" and can kill a company. Also, the importance of a particular bug varies from company to company. For some programs, a typo in a dialog box is a low-priority cosmetic bug to be fixed in a future release, and in others it's a stop-ship emergency.
–
Bob MurphyOct 28 '10 at 21:19

5

+1 for log the defect. Other defects may take priority...
–
SHugMay 6 '11 at 21:06

Customers will ALWAYS find bugs. There's no hiding bugs from customers, ever. In the end bugs you introduce will always come back to you. Not fixing them is simply a bad professional practice. Professionals don't do this.

When customers find bugs, the company looks bad, not an individual developer. This is much worse for the company, so there's your case for making the change. If you really are not sure about making this change in fear of introducing other bugs, talk to a more senior developer, project technical lead or whoever else is in the position to make a decision on such change and subsequently handle consequences.

That's a sad truth. We had to deliver a project we tested like crazy, and still the customer managed to crash it within 3 minutes.
–
Oliver WeilerOct 28 '10 at 20:05

3

@Helper Method: Maybe if you had tested it like sane people it would've been better.
–
Vinko VrsalovicOct 28 '10 at 22:07

@Helper Method: Although, more seriously, if it was really three minutes this looks like the testers didn't know what were the real use cases expected by the customer.
–
Vinko VrsalovicOct 28 '10 at 22:09

8

@Helper Method: get the customer involved in testing (assumimng customer == client). Developers writing and testing code is trouble developers do not use software the same way. Developers are gentle. It's their work - their art: Customers bang on it like an ape on a panio. When possible, share the testing effort share the responsiblity. This does not fit into every enviornment, but working with customers as a team, not advisaries can bring about better software.
–
brian chandleyOct 29 '10 at 0:22

We're professionals here. If you find a failing path in the code that will cause a crash or incorrect behavior, you need to fix it. Depending upon your team's procedures, you likely need to file a defect, perhaps write a regression test, and check in the fix at the right time of the ship cycle. If it's a low priority bug, checking in the fix near the beginning of a milestone is always a good time because if you do cause a regression you won't affect the release cycle of the milestone.

Don't confuse this with refactoring or making performance improvements that are not related to a performance bug.

A distributed source control system where you can keep a separate repository of 'little bug fixes' and then easily merge at the beginning of a milestone is a great help here.

+1. Fix it. If the fix breaks something else, then that something else was broken anyway - fix it too. If your testing doesn't catch these breaks, then your testing is broken - fix that as well. You cannot let yourself be scared off changing the code by the fear of breaking something - that road leads only to utter ruin.
–
Tom AndersonDec 27 '10 at 13:18

Your examples definitely look like legacy code, having lots of technical debt and I sense there's the fear of change (BTW, this is not a criticism or a judgment). Your entire team has to acknowledge that you have this technical debt (so you alone aren't blamed for it) and then you can decide how you're going to deal with it.

In Example 1, if Save() doesn't access any instance data, what customer data does it save exactly? Start fixing and testing that.

In Example 2, it's easy to cover the speed calculator with tests and ensure that it calculates the result correct in all key examples.

In Example 3, there's danger of bringing dead code back to life. Should that code be eliminated altogether? What is the intent of the boolean condition under that if? Is it to ensure the string contains no invalid characters? Or to ensure it has "PO BOX" in it? The sooner you start addressing such questions, the better.

After you have fixed a number of such issues, have a sort of retrospective/post-mortem with your team. It is important to learn from the experience so that you can reduce your defect injection rate in the future.

All of the examples you gave seem to to have a common thread. You seem want to fix a bug that you don't fully understand. I say that because you note the possibility of unintended consequences on each one.

I would say its probably a huge mistake and as Ben Laurie writesdon't fix a bug you don't understand. In this famous example the Debian team broke the encryption for OpenSSL for Debian and derivatives like Ubuntu when they followed the results of an analysis tool.

If you believe there is a defect by looking at code make sure you can reproduce the defect in a way that the customer can see. If you can't why not spend your resources fixing something else.

You have good answers already. I will just add something about the issue of being afraid that something crashes.

First, in the ideal situation the software is modular, is architected correctly and there is good separation of concerns. In this case, the changes you make are highly unlikely to break anything as you will be in control of all the related code and there is no hidden surprises.

Unfortunately, the ideal situation is fictional. Regardless of the extent to which the coupling is loose, there will be coupling and hence the possibility of breaking something else.

The solution to this is two-fold:

Make the code as well structured as possible.

When the code is isolated enough that you know that nothing else will break, fix it.

Making the code well structured is not done by rewriting the whole code to a new architectural design. Rather, refactoring guided by tests is your friend here. In this step, you do not change the functionality.

The second step is that you fix the bug.

A few points are relevant:

Version Control is absolutely needed for this process.

The refactoring step and the bug fixing step are better committed to version control as separate commits, so each has a well defined historical functionality.

Don't fixate on the possibility of making another bug, you won't get anything done. Rather, think of making your code better. In other words, unless you know that you are increasing the bugs rather than decreasing them then you should do it.

Related to the last point: don't try to predict the future. Humans are biased to think that they are very good at prediction. In reality our prediction powers are short term. So don't worry about a vague undefined future bug. This is also the principle behind YAGNI.

The corresponding tool to version control in the bug world is the bug tracker. You will need that as well.

You need to fix bugs for two reasons: in order to satisfy the customer; and in order to be able to progress in your development. To use example 3 (the physics one): if the program satisfies the customer with such a broken equation, then there are many other parts of the software that are wrongly developed to mitigate this bug (for example the airbag deployment). If a version 2 (or 1.1) is required for this software, then you will find it more and more difficult to develop correct code based on this faulty one. You are then either heading for the big rewrite, or for the grand failure. Both of which you should avoid.

These are already more than a few points, so I guess I will stop here.

You appear to be focusing on #1, where #2 is the best place to sit. Sure, we programmers want our code to be right (#1), but people pay us for it to work (#2).

What you may or may not do to the code base that could accidentally introduce new bugs is irrelevant to #2's view of the current software. However, #1 matters for yourself or the maintenance programmer that follows. It's sometimes difficult to decide, but when #2 and #1 conflict, you have to know that #2 is clearly more important.

Neither. There is a third way: find a way to prove that "the problematic code" is actually causing issue from a business point of view. Confirm what you find with BA/QA or at least your manager. Then plan the fix when everyone is aware of issue.

There are more possible scenarios other than a valid bug in the cases you mentioned:

The code you are looking at are some dead code. It's possibly because like ysolik said: customers always find bugs. If they didn't, maybe the code never gets executed.

There was a WTF situation where the obvious error did have its purpose. (We are talking about production code, anything could have happened, right?)

Business/customers already knew about the issue but don't feel necessity to the fix.

Maybe more...

In any case above, if I am a manager, I don't want a developers to use his/her judgment and fix the "error" in place. Fixing the error in place may help in most of the time, but when it goes wrong, it may cause more trouble than its good intention.

If you only want to hire developers who don't use their own judgment, there's a lot of truly mediocre ones out there for you. You will have trouble hiring actual good ones, who have some confidence in their ability.
–
David ThornleyOct 28 '10 at 20:31

1

@David: don't extend my opinion to an inappropriate degree. Developers definitely need their judgment, but there should be a boundary. In this case, developers use their judgment to detect a potential bug and take further steps to address it.
–
CodismOct 28 '10 at 21:16

Or if you use a gated check-in, set that up, so you don't actually check-in broken code.
–
Anna Lear♦Oct 28 '10 at 20:12

3

Taking a long time is no excuse to commit shoddy code.
–
TobyOct 28 '10 at 23:24

@Toby: Who was talking about excuses? I'm currently working in a small team, not even half a dozen developers. The unit tests for the project take 1hr. Our policy is to run the tests that seem related to whatever you do, and then check in and let CI find out whether you broke something seemingly unrelated. This wouldn't work in a big team, but in a small one it saves a lot of time.
–
sbiOct 29 '10 at 11:59

Fix them if they are crash/data loss bugs. Shipping a program with a known data-loss bug is downright malicious and inexcusable.

If the bug is cosmetical or non-critical (can be avoided) then it should be documented and a workaround should be provided. Ideally it should be fixed, but sometimes it's too expensive to fix it for the current release.

Notice how every bigger software project has a 'Known Issues' section in the ReadMe which usually lists exactly that: Bugs that are known.

Knowing Bugs and NOT communicating them is IMHO only acceptable for really minor/cosmetical bugs.

Fix it, and have it tested. If you decide to keep known bugs just because you are afraid to find more bugs, your program becomes a minefield of ticking time bombs so fast that it's going to become unrepairable sooner than you think.

Since you are the master and the code is the subordinate, you may not be afraid to change it when you see it's wrong. Fear of the code ("it might retaliate by breaking somewhere else") is simply unacceptable.

If there's clearly a crasher, or something wrong, then you should fix it. If there's an ambiguity in the spec, i.e. if you find yourself thinking "well the customer might expect this, but then it looks like it might be a bug" or a problem in the spec, like "we've been asked to do this but it sucks" then you need to find out what to do. Throwing the code over the wall and waiting for customer feedback is bad - you might ask a product manager if you have one, or ask the customer before you deploy the product.

Remember, "we know about that problem and will fix it in a future release" is synonymous with "we know about that problem but don't care about you enough to avoid you dealing with it".