Drowning in Data


Recently, a major technology vendor sent out questionnaires to senior business managers about data and decision-making. A number of them came back with additional comments, most of them variations on a theme: “Data is buried in a sea of noise.” “Swamped in information.” “I’m drowning.” Despite—or perhaps partly because of—a sizable drop in the cost of storing and retrieving information, many corporations are in danger of being swamped by information. Software applications from ERP to CRM to SCM may generate great efficiencies, but they also generate great floods of data. So great, in fact, that nowadays CIOs speak of petabytes (quadrillions of bytes) of storage rather than mere terabytes (trillions), a trend that must surely worry the branding heads at Dayton-based Teradata, a subsidiary of NCR Corp. But not the sales heads: in a survey released by the technology company in September, more than half of 158 corporate executives said their businesses have two or three times the amount of information available to them as they had last year.

What’s more, a lot of that data is useless, or worse. Experts estimate that anywhere from 10 percent to 30 percent of the data flowing through corporate systems is bad—inaccurate, inconsistent, formatted incorrectly, entered in the wrong field, out of a value range, and so on. In its most recent study of corporate data integrity, the Seattle-based Data Warehousing Institute found that nearly half the surveyed companies had suffered “losses, problems, or costs” due to poor data. The estimated cost of the mistakes? More than $600 billion.

Now, the potential cost of poor data management is about to rise. Under Section 404 of the Sarbanes-Oxley Act of 2002, which goes into effect in June 2004, publicly traded companies will be responsible for providing “full, fair, accurate, timely, and understandable disclosure” in their periodic reports.

Obviously, you can’t have accurate financials without accurate financial data. But identifying all Sarbox-relevant financial data and funneling it into a single report is no small feat. “It’s a big dumping of data,” says Mark Nagelvoort, an internal-controls manager who is heading up the Sarbox-compliance effort at Mahwah, New Jersey-based Hudson United Bank.

And that’s nothing compared with what companies may be forced to do with their unstructured data—the E-mails, contracts, and PowerPoint files that account for 80 percent of corporate information. Right now it appears that courts will treat such information as discoverable evidence in Sarbox-related prosecutions—an ugly prospect. Hence, many companies are now scrambling to archive as many E-mails, letters, and memos as possible. Warns James Watson, CEO at Chicago-based consultancy Doculabs Inc.: “Some companies are going from saving nothing to saving everything. It’s phenomenally dangerous.”

Dirty Rotten Data

Finance chiefs have been down this path before. In the mid-1990s, senior executives began routing data from far-flung financial, supply-chain, and customer-information systems into data warehouses and data marts.

On the drawing board, the projects made sense. By analyzing slices of company data, managers could spot trends and make better decisions. But in reality, data warehousing was, and is, fraught with difficulties; for every successful project, there is a failed one. And even with the successful ones, getting the right number can take forever. True, search and query speeds have improved dramatically in the past few years. Likewise, the cost of the software used to store the data has dropped. “The dot-com bust has brought down the cost of mature technologies like data warehousing,” says Danny Siegel, New York-based senior manager (global business technology) for Pfizer Inc.’s pharmaceuticals group. Siegel says data-mining tools cost one-fourth what they did a few years back (see “White Goods for Data?” at the end of this article).

Still, cheap storage and servers don’t guarantee good information. Indeed, bad data continues to bedevil many corporate data miners, asserts Michael Schrage, co-director of the E-markets initiative at the MIT Media Lab, in Cambridge, Massachusetts. “It’s an article of faith, almost to the point of a joke,” he says, “that every organization I’ve dealt with that wants to data mine looks at the quality of data and realizes it’s not sufficient to do it.”

To clean up the problem, some companies have turned to what’s known as ETL (extract, transform, and load) software. Sold by such vendors as Ascential Software, Ab Initio, and Informatica, ETL programs scour data before it’s routed into warehouses. Some companies use custom programs: Winston-Salem, North Carolina-based Krispy Kreme Doughnuts Inc., for example, uses homegrown data validation and exception applications that act as business-rule black boxes. The programs help keep funky data out of its financial-data warehouse, explains CIO Frank Hood.
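The rule-based validation Hood describes can be sketched in a few lines. The field names, value ranges, and rules below are hypothetical illustrations (not Krispy Kreme's actual logic), assuming each incoming record is checked against business rules and failures are routed to an exception queue instead of the warehouse:

```python
# Minimal sketch of a business-rule "black box" for data validation.
# Field names, ranges, and rules are invented for illustration.

def validate(record):
    """Return a list of rule violations for one incoming record."""
    errors = []
    if not record.get("store_id"):
        errors.append("missing store_id")
    amount = record.get("amount")
    if amount is None or not (0 <= amount <= 1_000_000):
        errors.append("amount out of valid range")
    if record.get("currency") != "USD":
        errors.append("unexpected currency")
    return errors

def route(records):
    """Split records into clean rows (bound for the warehouse)
    and exceptions (held for manual review)."""
    clean, exceptions = [], []
    for rec in records:
        errs = validate(rec)
        if errs:
            exceptions.append({"record": rec, "errors": errs})
        else:
            clean.append(rec)
    return clean, exceptions
```

Commercial ETL tools apply the same idea at scale, typically with declarative rule definitions rather than hand-written checks.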

Other businesses are attempting to reduce errors by reducing the number of inputs that feed such corporate reports as the general ledger. Earl Shanks, who has been CFO at NCR for two years, says the company used to deal with about 1,200 customized reports generated by the finance and administrative organization. A standardization project has reduced that number to just over 100, he says.

Consolidation projects could prove crucial in meeting regulatory requirements. Obviously, the fewer systems in place, the less data integration required. In addition, the deep-sixing of some programs should reduce the amount of chatter finance managers have to deal with. Case in point: Siegel recalls that before Pfizer deployed its financial-data warehouse, the company’s financial managers had to access 14 distinct systems. “A financial manager does not have the time to be an expert in 14 different systems,” he remarks.

Not surprisingly, some ERP vendors are flogging instance consolidation—that is, the adoption of a single version of a program—as the simplest way to comply with the new regulatory requirements. Since consolidating existing software tends to generate workflow efficiencies, it’s equally unsurprising that customers seem to be listening. A recent survey conducted by AMR Research revealed that 65 percent of publicly traded companies are strongly considering instance consolidation to help them deal with Sarbox (see “Six Degrees of Automation” at the end of the article).

Instance consolidation comes with its own difficulties, though. Cost tops the list. According to AMR, the price tag for an instance consolidation works out to about $10 million per $1 billion in annual revenues. What’s more, instance-consolidation projects can take anywhere from 12 months to two years to complete and often require a full reimplementation of the system.

Can You Demonstrate That?

For the moment, most finance executives aren’t worrying about instance consolidation. They’d be content instead to document exactly how numbers get rolled up into the general ledger.

At Hudson United, Nagelvoort was brought in to head up the bank’s compliance efforts and internal controls. He says the company’s executives are comfortable with the output of the controllers group. But, he adds, “if the CFO and CEO have to put their names on the dotted line…”

The ellipsis speaks volumes. With Sarbox approaching, finance managers will likely be fielding tough questions about data—particularly from audit committees. Says Mike Ressner, a former CFO who sits on the audit committees of WilTel Communications and Entrust: “Audit committees ought to be saying, ‘You’re representing this information as high quality and done with integrity. It would be nice if you could demonstrate that.'”

In all likelihood, internal-control departments will be responsible for the demonstrating. “Generally, controllers tend to be more focused on the general ledger or subledgers,” says Ressner. “Now they’ve pushed back out of those ledgers to the underlying systems to make sure the data flow is correct.”

Consider Hudson United, which operates about 200 branches in New Jersey, New York, and Connecticut. To help satisfy the Sarbox Sections 302 and 404 certification requirements, Nagelvoort put together a 12-member compliance team that is responsible for the bank’s business departments. Recently, Hudson United began installing a document-workflow program called SOXA Accelerator, marketed by HandySoft. According to Nagelvoort, the program helps the company create reports detailing what management and the company auditors consider material for each line item in the general ledgers.

Hudson United appears to be ahead of most companies—at least in the purchasing of compliance software. Many technology vendors see Sarbox as a big selling opportunity (the next Y2K problem, say some) and are pitching all-encompassing, large-scale compliance products, or “kitchen-sinkware,” as Schrage calls it. So far, few vendors have made a killing from these products. “Since they don’t touch sales,” notes Sid Banerjee, CEO at Falls Church, Virginia-based consultancy Claraview LLC, “there’s no ROI on these products.”

There may be no “R,” but there certainly is an “I.” The cost of purchasing compliance software can easily top the $1 million mark, say consultants. In addition, Watson of Doculabs points out that for every software dollar spent, corporate customers will have to spend $4 in service. “You’re talking about a boatload of consultants,” he says.

Further, some finance managers say they’re not overly impressed with the current crop of compliance products. When executives at Crown Media Holdings, an entertainment company that operates the Hallmark Channel, first started to assess Sarbox compliance, they contemplated buying software to help manage the company’s unstructured data. Specifically, Crown’s executive team was interested in a program that would manage the company’s voluminous contract rights—one of its key business processes. Although Crown has deployed a document-management program sold by Optika Inc., it has yet to purchase a rights-management application.

Deborah Birnbach, a partner in the litigation group at Boston law firm Testa, Hurwitz & Thibeault who advises clients on compliance, has heard similar complaints. “Companies will buy compliance software,” she predicts, “when they see other people buying it.”

On Deadline

They’d better hurry. Publicly traded companies have barely seven months to get in line with Section 404 of Sarbanes-Oxley.

The specter of that deadline leaves little time for businesses to dig too deeply into data deficiencies—or to automate data tracking. In fact, Todd Naughton, controller at Vernon Hills, Illinois-based bar-code maker Zebra Technologies, says that because of the June 2004 deadline, Zebra has ruled out deploying software programs for the first round of certifications. “It’s three to six months just to pick a tool and install it,” he claims. “And that’s before anybody starts putting data into it.”

Instead, executives at scores of companies are manually documenting the policies and procedures intended to safeguard the integrity of their financial data. The documenting includes not only assessing where the data is but also deciding who should have access to it. “For me, the problem is not having too much data,” says NCR’s Shanks, “but how do we use that data and make that data available to the right people at the right time.”

NCR maintains an enterprise data warehouse to help with that large task. For many businesses, however, Sarbox compliance remains a very low-tech affair. Zebra simply gathered 10 to 15 employees in a room with an Excel spreadsheet and went about identifying the company’s material risks and the controls to address each risk, says Naughton.

Observers say even low-tech approaches can carry some hazards, however. Former CFO Ressner worries that, for some companies, the documenting process could get out of hand, resulting in data about data. “The over-rotation of this,” he conjectures, “is that you could end up with a manual for everything.”

Ironically, some of this painstaking documenting could come back to haunt companies. According to attorney Birnbach, creating a paper trail about internal-control procedures before identifying what those procedures are may prove to be a big mistake. “It’s not wise to put an open discussion about the assessments of control processes onto paper,” she notes. “It’s discoverable.”

Wait ’Til Next Year

Faced with stiff penalties for lax internal controls, some businesses will no doubt ignore that warning. Instead, they will start saving every bit of unstructured data in the house. “Companies will start saving all E-mail,” predicts Doculabs’s Watson. “Unless you’re confident in documenting policies and data, you’ll have to save it.”

What’s more, identifying a company’s internal controls and key financial processes—a Herculean task—is not a one-off deal. “We’ll have to update any change we make in internal controls, or if we install new software,” concedes Crown’s Thompson. “We’ll have to update anything that has implications for our flowcharts or internal processes.”

Eventually that will lead many companies down the path of automation. As Watson notes, merely backing up everything won’t cut it: “You need auto indexing, and you need rules and parameters for the indexing.”
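Watson's point about rules and parameters for indexing can be illustrated with a toy example. The keywords and retention categories here are invented, assuming each archived message is tagged by simple keyword rules at save time rather than dumped into an undifferentiated backup:

```python
# Toy sketch of rules-driven auto-indexing for archived e-mail.
# Keyword rules and retention categories are hypothetical.

INDEX_RULES = [
    ("contract", {"category": "legal", "retain_years": 7}),
    ("revenue", {"category": "financial-reporting", "retain_years": 7}),
    ("invoice", {"category": "accounts", "retain_years": 5}),
]

DEFAULT_TAG = {"category": "general", "retain_years": 2}

def index_message(subject, body):
    """Tag a message with the first matching keyword rule, else a default tag."""
    text = (subject + " " + body).lower()
    for keyword, tag in INDEX_RULES:
        if keyword in text:
            return tag
    return DEFAULT_TAG
```

Production archiving systems use far richer rules (senders, attachments, classifiers), but the principle is the same: without tags like these, "saving everything" yields an archive no one can search or defensibly purge.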

A ray of hope for technology vendors? Possibly. Some finance managers say, yes, they’ll likely be more inclined to purchase compliance software in 2004—once they’re through with all their documenting, that is. Says Naughton: “I don’t want to do this every year.”

John Goff is a senior editor at CFO.

White Goods for Data?

When Section 409 of the Sarbanes-Oxley Act of 2002 kicks in for real—in January 2004—many companies will have to report material events to the Securities and Exchange Commission within 48 hours. Meeting the short deadline could prove to be a bear, particularly for businesses that plan to examine financial data to determine if an event qualifies as material.

The snag: data analysis is still anything but real time. Data warehouses, which nowadays house terabytes of information, are rarely updated on a daily basis. What’s more, the traditional architecture for warehouses—a patchwork of various drives, servers, and software—is best suited for backward-looking, slow-cooking sorts of analysis. The large amount of data movement in a typical warehouse limits the slice of information that can be accessed in a single search (usually about 1 percent of the available data). To get a fuller view, users must engage in repeated queries. Says one CEO: “You go back and forth, back and forth.”

Appliances to the Rescue

Managers who need to perform ad hoc queries—and need the results now—are generally out of luck. Notes Dan Vesset, a research manager at technology consultancy IDC: “Speed of decision-making, whether we’re talking about real time or near-real time, is still only a goal for most organizations.”

This may be changing, however. A new device, called a data appliance, could radically alter the time it takes to analyze data. Built from the ground up as a dedicated storage, retrieval, and analytics system, a data appliance is an all-in-one machine. Since server, storage, and software are integrated at the lowest level, there’s less movement of data. The result? A 10-to-50-times improvement in performance for products from data-appliance maker Netezza Corp., claims Jit Saxena, CEO of the Framingham, Massachusetts-based company.

Netezza sells five data-appliance models, ranging in price from under $1 million to $2.5 million. The basic unit, a rack, can store up to 4.5 terabytes of data. To increase capacity, customers simply buy additional racks. As for the vendor’s performance claims, Wakefield, Massachusetts-based Epsilon, which hosts data for financial-services companies and others, recently installed a Netezza data appliance. Mike Coakley, Epsilon vice president of marketing technology, recalls the benchmarking the company performed on the device before making a purchase. “We tested load times, queries, summarizations,” he says. “The results were astronomical—borderline ridiculous.”

Coakley claims the data appliance has cut load times at Epsilon from 11 hours to 3. Complex SAS queries on an Oracle database, he notes, used to take 2 hours; now they take 15 minutes. Says Coakley: “This is a real shift.” —J.G.

Six Degrees of Automation

Costs and benefits of IT programs for Sarbox compliance.

Technology option: ERP instance consolidation
Costs and effort required: Projects cost about $10 million per $1 billion in annual revenue; often requires a full reimplementation of the system; projects can take 12 to 24 months
Potential benefits: Consistent processes across all units; much better visibility across the company; additional 25 percent decrease in IT maintenance costs
Percentage of public companies considering this option: 65%

Technology option: Turning on controls within current systems
Costs and effort required: One of the least-costly options; may require help from a systems integrator to reconfigure the existing system