My Thoughts on Spanners, Universal BI Developers

The idea of universal developers has been spreading among BI people for a long time now. It’s time to take a critical stance.

A few days ago I read the 2010 TDWI article “The Spanner: The Next Generation BI Developer”, written by David Stodder, Fern Halper, and Philip Russom. Back in 2010, the authors argued that BI is a very agile, fast-changing arena (which is even more true today) and that BI teams do not meet the expectations and needs of businesses. To put it simply: BI is too slow.

The Spanner

They also described the interesting way Eric Colson solved this problem at Netflix. Colson introduced the notion of a “spanner”: an agile BI developer who builds an entire BI solution single-handedly. The person “spans” all BI domains, from gathering requirements, to sourcing, profiling, and modeling data, to ETL and report development, to metadata management and QA testing. That worked great for Netflix. Colson claimed that one spanner worked much faster and more effectively than a team of specialists. I strongly suggest you read the whole article. I don’t want to rewrite it here; instead, I would like to share my thoughts on and experience with “spanners”.

The One for Every Job

It makes sense, doesn’t it? The whole Agile BI movement itself is partially about having a team of “spanners”. Two of the twelve principles behind the Agile Manifesto are: “Build projects around motivated individuals. Give them the environment and support they need, and trust them to get the job done.” and “The best architectures, requirements, and designs emerge from self-organizing teams.” And it works!

There is no waiting for the rest of the team, no coordination overhead, no dumb preferences for any specific BI layer, and no finger-pointing in the case of failure. We often worked in a very similar way at my previous company, Profinit, and I would do it again anytime.

What needs to be highlighted in “The Spanner” article is the importance of having the right people. And what does that mean? Simply:

more agile

flexible

open minded

with broad skills

with strong computer science background

But having them is not enough.

The Tools for the Universals

You have to give them the right tools to be really fast and effective. There are two parts of the article I really don’t like, both connected to tools. Let’s take a look at them.

Tool Number One

“Also, since spanners aren’t bound by a written contract (i.e., requirements document) created by someone else, they are free to make course corrections as they go along and “discover” the optimal solution as it unfolds.”

There should always be something written and shared between all parties. Being agile doesn’t mean you don’t have to write anything down. You need at least some basic info on what needs to be done for the business and why. In my experience, if you don’t understand why, you are going to deliver a big piece of crap [corrected by our marketing guy]. Also, when you force yourself to write something down, you make it concrete. Yes, you can do it badly and be very abstract. But still, it is much harder to pretend you understand what needs to be done if things are “on paper”.

And from the perspective of long-term maintenance, you simply need requirements and architecture specifications. Just try to imagine this typical scenario:

You have a large BI system with millions or even tens of millions of lines of code, five years in production, a lot of changes made every month, and a new developer joins the team. Do you really believe this person wouldn’t need anything to understand the system and be able to modify it? One of the most important things is the ability to navigate quickly through the existing BI environment without spending a lot of time trying to understand what someone else did before. Because of that, you need documentation that is both comprehensive and easy to use.

Unfortunately, a lot of lazy guys abuse the part of the Agile Manifesto that says “Working software over comprehensive documentation” to avoid documenting at all. But that simply does not work. On the other hand, there are tools that help a lot with this issue. And, OK, one of them is Manta Flow (it reverse-engineers existing code and navigates you through the ocean of custom code in BI in a very easy-to-use and visual way).
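To make the idea of reverse engineering documentation from existing code a bit more concrete, here is a minimal sketch in Python. It is not how Manta Flow or any real lineage tool works; it only assumes that the code base consists of simple INSERT ... SELECT statements and uses naive regular expressions where a real tool would need a proper SQL parser. The function names, patterns, and table names are all hypothetical.

```python
import re
from collections import defaultdict

# Naive, hypothetical patterns: a real tool would parse SQL properly.
INSERT_RE = re.compile(r"insert\s+into\s+([\w.]+)", re.IGNORECASE)
SOURCE_RE = re.compile(r"\b(?:from|join)\s+([\w.]+)", re.IGNORECASE)


def extract_lineage(sql_script):
    """Map each INSERT target table to the set of tables it reads from."""
    lineage = defaultdict(set)
    for statement in sql_script.split(";"):
        target = INSERT_RE.search(statement)
        if not target:
            continue  # not an INSERT statement, nothing to record
        sources = SOURCE_RE.findall(statement)
        lineage[target.group(1).lower()].update(s.lower() for s in sources)
    return dict(lineage)


def upstream(lineage, table):
    """Return every table that feeds `table`, directly or indirectly."""
    seen = set()
    stack = [table]
    while stack:
        current = stack.pop()
        for source in lineage.get(current, ()):
            if source not in seen:
                seen.add(source)
                stack.append(source)
    return seen


# Made-up example script with two load steps.
EXAMPLE = """
INSERT INTO stg.orders SELECT * FROM src.orders;
INSERT INTO dw.fact_sales
SELECT o.id, c.region
FROM stg.orders o JOIN stg.customers c ON o.cust_id = c.id;
"""

if __name__ == "__main__":
    lineage = extract_lineage(EXAMPLE)
    print(sorted(upstream(lineage, "dw.fact_sales")))
```

Even a toy like this gives the new developer an answer to “where does this table come from?” without reading every script by hand, which is the kind of navigation the scenario above calls for.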

Tool Number Two

“The only thing you need is a unifying data model and BI platform and a set of common principles, such as ‘avoid putting logic in code’ or ‘account ID is a fundamental unifier.’ …. Thus, with spanners, you no longer need …., a BI methodology, project managers, and a QA team, says Colson.”

Of course you need a BI methodology. It is always there: bad or good, heavy or lightweight, explicit and written or unwritten, but always there. And do you really expect to have ten to fifty “spanners,” all of them smart, hardworking, and very consistent in how things are solved and done? No newcomers? 100% resource stability? Ufff, don’t make me laugh! My advice: choose the right people for your team and define rules for how the work has to be done. And adapt those rules from time to time according to your current situation. Those rules are your simple BI methodology. But even more importantly, you should find a way to check and enforce those rules. Again, you can find several tools out there to help you with this stuff (and, to be honest, Manta Checker is one of them).
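As an illustration of what “check and enforce those rules” can mean in practice, here is a minimal sketch of an automated rule check in Python. It does not describe Manta Checker or any other product. The two rules (no SELECT *, no hard-coded dates), their names, and the regular expressions are assumptions invented for this example; your team’s rules will differ.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    pattern: re.Pattern
    message: str


# Hypothetical team rules, expressed as simple line-level patterns.
RULES = [
    Rule(
        name="no-select-star",
        pattern=re.compile(r"select\s+\*", re.IGNORECASE),
        message="List columns explicitly instead of SELECT *.",
    ),
    Rule(
        name="no-hardcoded-date",
        pattern=re.compile(r"'\d{4}-\d{2}-\d{2}'"),
        message="Pass dates as parameters instead of hard-coding them.",
    ),
]


def check_script(sql_script):
    """Return (line_number, rule_name, message) for every rule violation."""
    violations = []
    for line_no, line in enumerate(sql_script.splitlines(), start=1):
        for rule in RULES:
            if rule.pattern.search(line):
                violations.append((line_no, rule.name, rule.message))
    return violations


if __name__ == "__main__":
    script = "SELECT *\nFROM sales\nWHERE day = '2010-01-01'"
    for line_no, name, message in check_script(script):
        print(f"line {line_no}: [{name}] {message}")
```

The point is not the regular expressions but the habit: once a rule is written down in a form a machine can check, it applies equally to every spanner and every newcomer, and adapting the methodology means editing a short list rather than retraining the team.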

So, to sum it up: this is a good article about the idea of “spanners,” which works very well in my experience, but it also promotes some harmful and evil practices that I believe will hurt you if followed blindly. There are no silver bullets, no secret shortcuts, and no easy ways when dealing with complex issues.
