Everyone knows how hard it is to find the cause of a bug that appears in only one environment. When that environment is Production, the problem is even bigger, even if you have a solid error logging system in place.

Recently we faced exactly this situation with no clues to help us, other than that the w3wp process was dying and the ASP.NET session remained locked. After some thought, we concluded that there was an infinite loop somewhere. We had a vague idea of the “zone” of code where it was happening, but we couldn’t reproduce it in any other environment, even after several hours of testing.

Some weeks ago one of my customers decided that one of its biggest ASP.NET web intranet projects needed an architectural revision, mainly to better support its customers with built-in fault tolerance, but also to decouple development of the various sub-projects through better separation between software modules.