The bugs produced by developers are legion, so why are advanced debugging skills still rare in the wild? How do you solve problems if you do not have the technical know-how to do a full root cause analysis across all the tech stacks in use?

Simple bugs are always reproducible in your development environment and can easily be found with the visual debugger of your favorite IDE. Things get harder if your application consistently crashes at customer sites. In that case the root cause is often an environmental problem which mostly cannot be reproduced in the lab. Either you install a debugger on your customers' production machines, or you learn how to take memory dumps there and analyze them back home.

There are also many other tools for Windows troubleshooting, such as Process Explorer, Process Monitor, Process Hacker, VMMap, … which help a lot to diagnose many issues without ever using a debugger. With some effort you can learn to use these tools, and then you are well equipped to solve many of the problems you encounter during development or on customer machines.

Things get interesting if you get fatal sporadic issues in your application which result in data loss, or if it breaks randomly only on some customer machines. You can narrow down where the application is crashing, but if you have no idea how you got there, then some industry best practice anti-patterns are used:

You know the module which breaks and you rewrite it.

You do not even know that. If the problem is sporadic, you tinker with the code until it becomes sporadic enough to no longer be an urgent problem.

That is the spirit of good enough, but certainly not of technical excellence. Even if you otherwise follow all the good practices like Clean Code and Refactoring, over the years you will still collect more and more subtle race conditions and memory corruptions in central modules. These modules then need a rewrite, not because the code is bad, but because no one is able to understand why it fails and how to fix it.
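To make the term race condition a bit more concrete, here is a minimal Python sketch of a lost update. It is only an illustration and not taken from any real product: the Counter class and the helper functions are hypothetical names, and the barrier is an assumption I add to force the unlucky thread switch that would normally happen only once in a while on some machine.

```python
import threading


class Counter:
    """Hypothetical shared counter, used only for this illustration."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def unsafe_increment(self, between_read_and_write=None):
        current = self.value                 # read
        if between_read_and_write is not None:
            between_read_and_write()         # stand-in for an unlucky thread switch
        self.value = current + 1             # write based on a possibly stale read

    def safe_increment(self):
        with self._lock:                     # read-modify-write as one guarded step
            self.value += 1


def run_two_threads(target, *args):
    """Run the same callable on two threads and wait for both to finish."""
    threads = [threading.Thread(target=target, args=args) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


def lost_update_demo():
    counter = Counter()
    # Both threads wait here after reading, so both have seen the old value
    # before either of them writes. One increment is lost.
    barrier = threading.Barrier(2)
    run_two_threads(counter.unsafe_increment, barrier.wait)
    return counter.value


def locked_demo():
    counter = Counter()
    run_two_threads(counter.safe_increment)
    return counter.value
```

Without the barrier the unsafe version would work nearly every time, which is exactly why such bugs show up as sporadic failures at customer sites and not on your development machine. The locked variant closes the window between the read and the write.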

I am surprised that so many companies, especially small ones, can get away with dealing with technical debt that way without going out of business. Since most software projects are tight on budget, customers expect some margin of error and can live pretty well with errors that have been worked around. I am not saying that this is the wrong approach. It may be more economical to bring a green banana to market, see what the customers are actually using, and then polish the most important user-facing features fast enough that the users do not walk away from the product. The cloud business brings some fascinating opportunities to quickly roll out software updates with new features or fixes to all of your customers. But you need to be sure that the new version does not break in a bad way, or all of your customers will notice it immediately.

Did you ever encounter bugs which you were not able to solve? What creative solutions did you come up with?