Why Companies Still Hand Code, But Don't Have To

I often interview vendors, and one question I like to ask is who they consider their competition to be. PR people hate this question, because the last thing you want is your company talking about the competition. I understand that, but I ask it anyway, because it's a confusing field and the answer is always interesting.

Whenever I ask this question to a data-integration vendor, without fail, they'll tell me their main competition is hand coding.

Last year about this time, I asked Phillip Russom of The Data Warehousing Institute why companies were still hand coding when they could buy tools to do that work for them. He had no idea, but the question vexed him, since, in his words, hand-coded data integration is a "fairly old and arcane practice." He estimated that roughly half of data-integration solutions are coded from scratch. A 2008 study by an IBM user group confirmed his estimate, finding that 50 percent of companies still use hand-coded scripts to move data.

Sherman begins by examining how the use of ETL and other integration tools has evolved. He notes that data-integration tools now offer technologies and processes that extend well beyond basic ETL tasks, and that these suites can help with a wide range of projects.

And yet ... even these more robust tools still aren't very pervasive, with both small firms and global Fortune 1000 companies still resorting to hand coding. And this is where he finally answers our "why" question.

It boils down to this: ignorance of the market, cost, a lack of resources, and a stubborn adherence to corporate standards, even when enterprise-wide standards don't apply to the situation.

Fortune 1000-size organizations are turned off by the expense of licensing these tools for wider use. They lack the resources, namely data-integration developers, to deploy the tools more widely, and the tools don't always fit a predefined corporate standard required by individual groups. In other words, sometimes you need an enterprise-class data-integration tool, and sometimes you just need to move some data downstream. Companies could use ETL for the projects that move data downstream, but for some reason, people insist on the more expensive enterprise data-integration tool, which they can't afford for that particular project, so they opt to hand code instead. Crazy, right?

Likewise, the remaining "smaller" companies are turned off by the cost, but Sherman believes it's because they assume only Fortune 1000 companies can afford these tools. "From their perspective, you either have to pay for high-end tools or you hand code, and hand coding usually wins out," writes Sherman.

That leads us to another reason they don't use the tools: They're unaware of the market. To their detriment, they don't know about data-integration solutions priced within their budgets, because these are the tools that don't get mentioned by industry analysts or publications. (I hope we're the exception. I know I've talked to and about open source solutions on more than a few occasions.)

He wraps up the piece with an admonition to expand your use of data-profiling tools, which he describes as a more effective and efficient way to check your work after a data-integration project:

Data profiling should be established as a best practice for every data warehouse, BI, and data migration project. In addition to meeting project requirements, data profiling should be an ongoing activity to ensure that you maintain data quality levels.
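To make the quote concrete, a data-profiling pass can be as simple as counting rows, nulls, and distinct values per column. Here is a minimal Python sketch; the column names and sample rows are invented for illustration and are not from Sherman's article:

```python
# Minimal data-profiling sketch. Real profiling tools go much further
# (pattern analysis, cross-column rules), but the core idea is simple:
# measure each column and compare against expectations.

def profile(rows, columns):
    """Return per-column row, null, and distinct-value counts."""
    report = {}
    for col in columns:
        values = [r.get(col) for r in rows]
        nulls = sum(1 for v in values if v is None or v == "")
        distinct = len({v for v in values if v not in (None, "")})
        report[col] = {"rows": len(values), "nulls": nulls, "distinct": distinct}
    return report

# Hypothetical post-load sample from a customer table.
rows = [
    {"customer_id": 1, "email": "a@example.com"},
    {"customer_id": 2, "email": ""},
    {"customer_id": 2, "email": "b@example.com"},
]
report = profile(rows, ["customer_id", "email"])
print(report["email"]["nulls"])        # 1
print(report["customer_id"]["distinct"])  # 2
```

Run after every load, a report like this catches the missing emails and duplicate keys that a hand-coded script would otherwise silently pass through.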

The article suggests that neither group really understands how far these tools have come and how versatile they can be. Fortunately, Sherman has suggestions on how you can fix this problem, no matter what size company you're in, so you can wean yourself off hand-coded data integration.


Loraine, you keep writing about subjects close to my heart. Our product at OpenSpan lets you build integrations and automations against tens of thousands of applications without a single line of "code." HOWEVER, even some of my own SAs like to hand code. To my great annoyance, I should add! Sometimes they think it's faster, even than drag and drop, but it's NOT. I've spent years trying to convince them that if they really think they have to drop down to writing code, then we should build a GUI component to emulate it so others don't have to write code in the future.

Now, my beating them up all the time has given us a very RICH visual IDE, but you still can't change people's attitudes. They know they won't be around in 10 years to support what they build, so perhaps they don't think beyond that. I have seen that attitude too. I ask my team: why do you write script when you could drag this component on, so that anyone who looks at it tomorrow or next week or next century can visually see your intent? Blank stares are common! So, is this at the heart of it?

I think so. I always think our technology problems are often people problems, which you hint at. Our best developers get bored too quickly to want to work on anything long enough to see it through; this is why I favor quick iterative wins that see the start, middle, and end of a project in months, not years.

Since this people problem is unlikely to be solved, ever, integration is always going to be a need, and reuse will be limited. This is the way it's always been, and the TRILLIONS of lines of code still being written are a testament to that fact.

Last point. My developers argue strongly that visual coding, component coding, or any other form of coding is STILL coding. When VB came along (compare that with Assembler; yikes), it was even thought of as a 3GL! PowerBuilder was a 4GL, but it's just as difficult to integrate that "CODE" as it is something written in HTML last week!

There you have it. The need for your blogs and my product exists because integration of legacy "code" in any form is here to stay.

I think there is another factor: most integrations and migrations are driven by in-house technical teams, who often prefer to use power tools or scripting languages even when alternatives are available.

I know one UK corporate that has licenses for every imaginable product covering the ETL and DQ space, yet it chose to run one of its toughest migrations in SQL and shell script, simply because the development team was comfortable in that environment.
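For context, the kind of hand-coded "just move some data downstream" job such teams reach for might look like the sketch below. This is a hypothetical illustration in Python using the stdlib sqlite3 module (the real migration used SQL and shell, and the table and column names here are invented):

```python
import sqlite3

# Hypothetical source system with an orders table.
src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE orders (id INTEGER, amount REAL, status TEXT)")
src.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                [(1, 10.0, "paid"), (2, 5.5, "void"), (3, 7.25, "paid")])

# Extract and lightly transform: only paid rows move downstream.
rows = src.execute(
    "SELECT id, amount FROM orders WHERE status = 'paid'").fetchall()

# Load into the downstream store.
dst = sqlite3.connect(":memory:")
dst.execute("CREATE TABLE paid_orders (id INTEGER, amount REAL)")
dst.executemany("INSERT INTO paid_orders VALUES (?, ?)", rows)
dst.commit()

total = dst.execute("SELECT SUM(amount) FROM paid_orders").fetchone()[0]
print(total)  # 17.25
```

It works, and it's quick to write, which is exactly the appeal; the costs show up later, in maintenance, error handling, and the lack of any visual record of intent.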

I think the cost and training issue is also a problem.

Getting funding for any kind of data-movement project is tough, so despite all the ROI cases, it is still a big hurdle to convince sponsors of the need to invest major sums in team training and licenses, no matter how obvious the benefits.

Prices are coming down, though, so I do expect this to change. On www.datamigrationpro.com we recently spoke with several companies that were actively moving from a scripting environment to a tools-based one.

Loraine, great article. However, there's an interesting additional point: I see lots of organizations using their ESB for ETL processes. Because they've often invested significantly in ESB tools and a surrounding team (an Integration Competency Center), many of these teams are being tasked with avoiding ETL-oriented hand coding, but with the wrong tool set!

Dealing with issues such as ESB-oriented batch processing, which should be an oxymoron, is taking up quite a bit of my consulting time.