Is stale data evil?

When you're a software engineer who builds software for enterprises like banks or insurance companies, it is normal to have huge databases (several gigabytes). Such systems have an operational application in which users do the daily business of the company, and they have more informative (or strategic) parts which the management uses. At first glance there isn't a problem with those two views, but as you probably know, those companies usually cover the second, management-oriented part with a data warehouse solution.

But what if your customer doesn't want a data warehouse solution, or can't afford one? Then you will probably add reports and search views to your application. In this blog post I describe some aspects to consider if you have to choose this variant.

Stale data as a requirement

Unfortunately, the question "How old may the data on this report/search be?" is rarely asked. If the answer is "The report/search has to show the right data", you have to ask the customer again. The problem is that the data may already be stale the moment the query has finished, because somebody has changed it in the meantime.

In my experience there are only a few reports that need data which is as fresh as possible. But it is essential that you ask this question.
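Once the customer has named an acceptable age, that number can become an explicit part of the code. The following is only a minimal sketch of the idea, not a definitive implementation: the class name StaleTolerantReport, the run_query callable and the max_age_seconds parameter are all hypothetical names chosen for illustration.

```python
import time


class StaleTolerantReport:
    """Reuses a report result as long as it is younger than max_age_seconds.

    Hypothetical helper for illustration; run_query stands for whatever
    expensive query produces the report.
    """

    def __init__(self, run_query, max_age_seconds, clock=time.monotonic):
        self._run_query = run_query
        self._max_age = max_age_seconds
        self._clock = clock          # injectable so the age check can be tested
        self._cached = None          # (creation time, result) or None

    def get(self):
        now = self._clock()
        if self._cached is not None:
            created, result = self._cached
            if now - created <= self._max_age:
                # Still within the staleness the customer agreed to.
                return result
        # Too old (or never computed): hit the database again.
        result = self._run_query()
        self._cached = (now, result)
        return result
```

With a report that may be up to five minutes old, StaleTolerantReport(run_query, 300) would run the expensive query at most once per five minutes, no matter how often the report is opened.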

Isolate only as far as needed

Most searches or reports read the essential tables of your relational database, so it is important that they don't affect your daily business. You may now ask yourself how those queries could have any impact at all.

If you use Microsoft SQL Server, the default isolation level is "Read committed". If a query isn't written carefully, it can end up locking a whole table (for example with a shared lock at table level, which blocks any inserts or updates until it is released). If that happens, your users will notice it because they have to wait while they try to save their data.

Whenever you create a search or a report, you have to ask yourself which isolation level you will use. If you use dirty reads (isolation level "Read uncommitted"), your query will take practically no locks that block writers, but you have to deal with data which is wrong. This is because the data you have read could be rolled back afterwards, and the same query would then no longer return the rolled-back data.
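To make the dirty-read effect concrete, here is a toy model in Python. It is a sketch under strong assumptions and not how SQL Server works internally: ToyTable is a hypothetical in-memory class with a single open transaction, and its "read committed" mode simply returns the last committed value, whereas a real server might make the reader wait for the writer instead.

```python
class ToyTable:
    """Toy key/value table with one open transaction (illustration only)."""

    def __init__(self, rows):
        self._committed = dict(rows)   # values every reader may see
        self._pending = {}             # uncommitted changes of the open transaction

    def write(self, key, value):
        # The change stays uncommitted until commit() is called.
        self._pending[key] = value

    def commit(self):
        self._committed.update(self._pending)
        self._pending.clear()

    def rollback(self):
        # Uncommitted changes disappear as if they had never happened.
        self._pending.clear()

    def read(self, key, isolation="read committed"):
        if isolation == "read uncommitted" and key in self._pending:
            # Dirty read: the reader sees data that may still be rolled back.
            return self._pending[key]
        return self._committed[key]


table = ToyTable({"balance": 100})
table.write("balance", 500)                            # a user is saving, not committed yet
dirty = table.read("balance", "read uncommitted")      # the report sees the pending value
table.rollback()                                       # the user cancels
after = table.read("balance", "read uncommitted")      # the same query now differs
```

Here dirty holds a value that never officially existed, while after holds the committed one. Whether that is acceptable depends on the answer the customer gave to the staleness question.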

Conclusion

Stale data, or even wrong data, on a search or an informational report doesn't have to be a mistake. Sometimes it's simply good enough to fulfil the requirements and make the customer happy. And that's what it's all about.

About me

My name is Patrick Weibel and I'm a software architect. I am passionate about software engineering and software architecture. It is very important for me to do continuous learning and to improve myself.

Disclaimer

The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way.