Data warehousing unites agency data

Data warehousing solutions have not been widely adopted by federal agencies, but they have proven extremely successful at helping a number of agencies achieve their missions.

Agencies using data warehousing, which combines databases across an enterprise to support management decision-making, report that the technology has given them a unified view of disparate databases, helping them get a better handle on the enormous amounts of data that support their daily activities. Some agencies even have begun using these solutions to reach out to the public via the Internet.

The technology has not quite caught fire within the risk-averse federal government, but observers believe agencies will not be able to ignore the potential advantages of data warehousing for long.

One agency that said it is making cost-effective use of data warehousing is the Internal Revenue Service, which is saving taxpayers millions of dollars with its Compliance Data Warehouse. CDW, a terabyte-size decision-support system, enables 150 economists, research analysts and statisticians to search for ways to improve voluntary compliance with federal tax laws.

The 2-year-old data warehouse has solved a key problem for the tax collection agency: how to gain access to critical information for research and decision-making. The IRS' data, which previously was stored in legacy online transaction processing systems designed to process tax forms, "was nearly impossible to query or analyze," said Jeffrey Kmonk, information technology manager for the IRS' Office of Research.

The government estimated that the IRS' inability to analyze that raw data was costing billions in lost tax revenue. Kmonk estimated that the IRS has saved $250 million through CDW after a $2 million investment in hardware, software and implementation services. The savings came entirely from the IRS' improved ability to ferret out taxpayers who were claiming too many dependents.

"The IRS' mission has been refocused on compliance rather than enforcement," Kmonk said. "And this warehouse is a crucial tool for helping researchers find ways to boost compliance."

The government has invested billions of dollars in data warehousing initiatives over the past few years, but a far greater investment is warranted as federal agencies strive to do more with less and still improve constituent services, industry observers said.

"I think the federal government should be investing between $30 billion to $40 billion a year," said Michael Burwen, president of the Palo Alto Management Group, a California-based research and consulting firm. PAMG estimated that $23 billion was spent on data warehousing worldwide in 1998. That figure is growing at a rate of 50 percent per year, which will translate to $113 billion in 2002.

Observers said the information gathered by federal agencies is tremendous, but most agencies have yet to use warehouses or business intelligence technologies to uncover which programs are working and which are not. "That's why data warehousing can have massive payoffs in the federal sector," Burwen said.
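PAMG's worldwide spending figures quoted above describe compound growth, and the arithmetic can be checked with a short sketch. The function names below are illustrative and do not come from PAMG; the $23 billion base, the 50 percent rate and the $113 billion estimate are the figures cited above. Compounding at exactly 50 percent for four years gives a little more than $116 billion, so the $113 billion estimate appears to imply a slightly lower rate of roughly 49 percent a year.

```python
def project(base, rate, years):
    """Compound a starting value at a fixed annual growth rate."""
    return base * (1 + rate) ** years


def implied_rate(start, end, years):
    """Annual growth rate that takes start to end over the given years."""
    return (end / start) ** (1 / years) - 1


# Figures quoted in the article, in billions of dollars.
YEARS = 2002 - 1998
spend_2002 = project(23.0, 0.50, YEARS)
rate = implied_rate(23.0, 113.0, YEARS)

print(f"50 percent a year from $23 billion: ${spend_2002:.1f} billion in 2002")
print(f"Rate implied by $113 billion: {rate:.1%}")
```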

There are many well-known benefits of being able to "slice and dice" massive amounts of data hidden in legacy systems. At the highest level, data warehouses harness technology to support business objectives.

Data warehouses also combine data from disparate systems into a single database that can provide a more unified view of an entire agency's data, said Alex Moissis, vice president of Americas marketing for Business Objects Inc., a San Jose, Calif.-based application developer for data warehousing.
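The idea of pulling disparate systems into one database can be illustrated with a minimal sketch. Everything in it is hypothetical: the two "legacy" extracts, the table names and the ID formats are invented for illustration and do not describe any agency's system or any vendor's product. It uses Python's built-in sqlite3 module to load both extracts under a shared key and then query them together.

```python
import sqlite3

# Hypothetical extracts from two legacy systems that format the same case ID differently.
PAYMENTS = [("C-001", 1200.0), ("C-002", 800.0), ("C-001", 300.0)]
CASE_STATUS = [("c001", "open"), ("c002", "closed")]


def normalize_id(raw_id):
    """Map each system's ID format onto one shared key, e.g. 'C-001' -> 'C001'."""
    return raw_id.replace("-", "").upper()


def build_warehouse(payments, statuses):
    """Load both extracts into one in-memory database under a shared key."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE payments (case_id TEXT, amount REAL)")
    db.execute("CREATE TABLE cases (case_id TEXT PRIMARY KEY, status TEXT)")
    db.executemany(
        "INSERT INTO payments VALUES (?, ?)",
        [(normalize_id(cid), amount) for cid, amount in payments],
    )
    db.executemany(
        "INSERT INTO cases VALUES (?, ?)",
        [(normalize_id(cid), status) for cid, status in statuses],
    )
    return db


def unified_view(db):
    """One row per case: status from one system, total paid from the other."""
    query = """
        SELECT c.case_id, c.status, COALESCE(SUM(p.amount), 0)
        FROM cases AS c
        LEFT JOIN payments AS p ON p.case_id = c.case_id
        GROUP BY c.case_id, c.status
        ORDER BY c.case_id
    """
    return db.execute(query).fetchall()


warehouse = build_warehouse(PAYMENTS, CASE_STATUS)
for row in unified_view(warehouse):
    print(row)
```

The key step is normalizing identifiers before loading. Once both sources share a key, a single query can answer questions that neither legacy system could answer alone.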

Data warehouses provide access to strategic information so that users can be proactive and respond creatively to their constituents. By easing access to data, a warehouse can even enable citizens to access information over the Internet. "This trend toward citizen self-service is among the fastest ways agencies can learn to do more with less," said Mike Schiff, director of data warehousing strategies for Current Analysis Inc., Sterling, Va.

Schiff said the Environmental Protection Agency is leading this effort with its Envirofacts Warehouse, which is designed to provide citizens, EPA employees and private industry with browser-based access to information on environmental hazards in specific geographic areas. Envirofacts is at www.epa.gov/enviro.

From Health Care to Home Loans

Since the middle of last year, the Defense Department's Military Health System (MHS), considered the fourth-largest health care organization in the world, has been running 11 regional data warehouses that are part of the Corporate Executive Information System.

Mike Mauro, the enterprise data warehouse systems architect for CEIS, Falls Church, Va., said the warehouses were designed to assist in re-engineering MHS "to help us understand better what we are doing in each region and how we are doing it. A data warehouse can also help decide what conditions bring the best return for quality of care." Mauro said the regional warehouses will help MHS determine whether it needs to alter its referral patterns within a region or for the entire system. The warehouses also will help determine when best to use telemedicine to support patient diagnoses.

However, many queries of the current warehouses would be best answered by an enterprisewide data warehouse, which is why CEIS Version II is under development. Field testing is due to start in July. If all goes well, the new data warehouse, featuring 2.5 terabytes of data, will be available for more than 2,500 users to query in September.

Currently being tested, the system is based on an IBM Corp. SP2 made up of 96 RS/6000 CPUs running Informix Software Inc.'s XPS Version 8.3 and MetaCube Version 4.2, a relational online analytical processing tool. Business Objects' BusinessObjects product is used as the front-end interface for analysis and reporting. Mauro expects the system will cost about $16 million in hardware, software and implementation services.

Elsewhere, the Federal Housing Administration's 85-gigabyte data warehouse has been used during the past year to monitor the performance of loans made by 9,000 lenders in 22,000 cities nationwide. The warehouse was set up because the FHA needed a better way to monitor the performance of loans it insured. It collected data on 23.5 million loans - data that had been stored on nine legacy systems.

"Without automated tools, there is no way we could monitor those loans to see where problems exist," said Allayne Hyde, deputy director of the FHA's Office of Lender Activity and Program Compliance.

The warehouse primarily is used to look for patterns of defaults on loans, especially where loans went bad in the first two years. The World Wide Web-based warehouse is available to agency employees through the Department of Housing and Urban Development's Web site.
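A query of this kind, flagging lenders whose loans default early, might look like the following sketch. The loan records, lender names and the day-count approximation of "two years" are all hypothetical and are not drawn from the FHA's warehouse; the point is only to show the shape of the analysis.

```python
from collections import defaultdict
from datetime import date

# Hypothetical loan records: (lender, origination date, default date or None).
LOANS = [
    ("Lender A", date(1996, 3, 1), date(1997, 1, 15)),
    ("Lender A", date(1996, 5, 1), None),
    ("Lender B", date(1995, 7, 1), date(1998, 9, 1)),
    ("Lender B", date(1996, 2, 1), None),
    ("Lender B", date(1996, 8, 1), None),
    ("Lender A", date(1997, 1, 1), date(1998, 6, 30)),
]

EARLY_WINDOW_DAYS = 2 * 365  # "first two years", approximated in days


def early_default_rates(loans):
    """Return each lender's share of loans that defaulted within the window."""
    totals = defaultdict(int)
    early = defaultdict(int)
    for lender, originated, defaulted in loans:
        totals[lender] += 1
        if defaulted is not None and (defaulted - originated).days <= EARLY_WINDOW_DAYS:
            early[lender] += 1
    return {lender: early[lender] / totals[lender] for lender in totals}


rates = early_default_rates(LOANS)
for lender, rate in sorted(rates.items(), key=lambda item: -item[1]):
    print(f"{lender}: {rate:.0%} of loans defaulted early")
```

Sorting by rate puts the lenders with the most early defaults at the top, which is where a compliance reviewer would presumably start.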

The initial warehouse has been so successful that the FHA is adding a data mart - a collection of databases narrower in focus than a data warehouse. Lending institutions will be able to access the data mart to see precisely how well their loans are performing and to check on how their partners' loans are performing.

"Mundane" Applications

Some agencies also are looking at building data warehouses for more "mundane" applications, such as IT operations support, according to data analysis software provider SAS Institute, Cary, N.C.

Most IT shops already collect performance metrics with their systems and network management tools, such as Computer Associates International Inc.'s Unicenter. A data warehouse can augment those products by pooling that data and running detailed analyses, said Jeff Babcock, vice president of SAS Institute's Public Sector Group.
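One detailed analysis of this kind is a simple trend projection for capacity planning. The sketch below is a hypothetical example: the utilization figures and function names are made up, and it is not based on SAS Institute's or Computer Associates' products. It fits a straight line to monthly peak readings and estimates when the trend would reach a chosen limit. Real workloads rarely grow in a straight line, so this is only a first approximation.

```python
def fit_line(ys):
    """Least-squares line through (0, ys[0]), (1, ys[1]), ...; returns (slope, intercept)."""
    n = len(ys)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def periods_until(ys, limit):
    """Periods after the last sample until the trend reaches limit (None if flat or falling)."""
    slope, intercept = fit_line(ys)
    if slope <= 0:
        return None
    crossing = (limit - intercept) / slope
    return max(0.0, crossing - (len(ys) - 1))


# Hypothetical monthly peak utilization of a Web server, in percent.
monthly_peak = [40, 45, 50, 55, 60, 65]
months_left = periods_until(monthly_peak, limit=90)
print(f"Trend reaches 90 percent in about {months_left:.1f} months")
```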

"If you are collecting metrics over time, you can predict when those things might occur and do a better job of planning," Babcock said. Key applications include load balancing and capacity planning, which can support operations such as Web servers, telephone systems and networks.

Some customers, particularly in the Defense Department and the intelligence community, also are exploring security applications.

One emerging network security tool is the intrusion-detection system, which monitors network traffic for unauthorized users. Numerous organizations are looking to pull together the information on intrusions that they collect from multiple systems so that they can begin to develop a better understanding of the methods being used to attack their systems. A data warehouse provides a good tool for such an application, said Daniel Boyle, SAS Institute's director for DOD and intelligence business.

Where Are the Warehouses?

So what's holding the government back from making a greater investment in data warehousing and business intelligence technologies? One answer lies in the monumental effort under way to solve Year 2000 problems. Even unspent money earmarked for Year 2000 fixes cannot go toward other investments such as data warehousing, analysts said.

While many private-sector businesses may have been motivated by curiosity to invest in data warehousing, the federal government must have specific requirements to make such an investment.

"Agencies can't take risks with taxpayer funding, so they must model the benefits they hope to gain before they can make an investment in specific products," said John Bender, a federal data warehousing sales representative for Oracle Corp.

The government also has been forced to address data warehousing more slowly than private entities because of privacy and security issues. "Agencies are struggling with how to present information that should be made publicly available while at the same time protecting the privacy of individual citizens," PAMG's Burwen said.

Data warehousing suppliers claim there also has been a drawn-out educational process for teaching potential federal clients about the need for data warehouses over traditional databases. "Our toughest job has been convincing federal users that for ad hoc queries, a warehouse, rather than a database, is best," said Laura Gifford, practice manager for Sybase Professional Services, Bethesda, Md.

Despite the obstacles, users, analysts and suppliers maintain that the federal sector is poised to make a larger investment in data warehousing and business intelligence technologies over the next three to five years.

Agencies also might be pushed toward data warehousing because of the Government Performance and Results Act, which requires agencies to set performance goals and report on results, much as private-sector companies track their performance. "This act is definitely increasing user interest in data warehousing," Bender said.

Finally, analysts said an accelerated investment in data warehouses and business intelligence technologies will be prompted by the remarkable power of word of mouth.

-- Reimers is a free-lance writer based in Germantown, Md. She can be reached at bdepompa@aol.com.

* * * * *

AT A GLANCE

STATUS: A few federal agencies report significant successes with data warehouses, which have allowed agencies to better analyze data, offer information to constituents and save money in the process. Still, the technology has not yet caught hold throughout government.

ISSUES: Data warehousing efforts may have been hampered by agencies' focus on Year 2000 issues and by privacy and security concerns. Data warehousing also represents a potential risk to federal agencies hesitant to spend taxpayer dollars on new technologies.

OUTLOOK: Great. As Year 2000 work winds down and agencies continue to feel pressure to operate more like businesses, even the most risk-averse agencies will not be able to ignore the success stories associated with data warehousing for long.