
logical data modeling, according to Pete Stiglich, senior consultant with Hinsdale, Ill.-based
EWSolutions and SearchDataManagement.com's data modeling expert. Pressed for time, money and staff,
companies charge forward with database and application development, only to learn later the costly
perils of skipping data modeling processes.

Part of the problem may be a general lack of understanding of the discipline, such as failing to
appreciate the difference between conceptual data modeling, physical data modeling and other
modeling activities, Stiglich said. Physical data modeling, or defining physical data structures,
is essentially required to build a database. Companies typically don't skip this step, but it's
frequently dismissed as just a technical task for the database administrator or developer. A
conceptual data model, by contrast, is a tool for understanding the business from a data
perspective rather than for modeling a system, Stiglich explained.

"For example, if you have entities like 'customer' and 'order,' how do those relate to each
other?" he said. "Can there be many customers on one order or only one customer per order?
[Conceptual data modeling] is a way to understand the business from a data perspective, so you can
then take that conceptual description and apply that to a particular solution. You can have a
conceptual data model that can be applicable to many applications."

The process of creating a conceptual data model also helps organizations uncover and define
common data objects and relationships, such as "customer" and "order," which might be used in
multiple applications. Many organizations fail to take this business-centric view, though, focusing
instead on physical data structures only, Stiglich said. Ostensibly to reduce project delivery
time, they skip conceptual data modeling, which should be done in the requirements and design
phase. But cutting this corner does not pay off in the end.

"You can miss a lot of requirements by not developing that conceptual data model," Stiglich
said. "That can have a dramatic impact on delivering the project on time and on budget."

One company that Stiglich worked with learned that the hard way. A provider of pet care
services skipped the conceptual data modeling step, which led developers to make incorrect
assumptions about the relationship between "customer" and "pet." The system they put in place
assumed a one-to-many relationship -- i.e., a customer can have many pets, but a pet belongs to
only one customer. In practice, any member of a household could be considered an owner of a pet,
so the pet record had to be duplicated whenever someone other than the original customer brought
the pet in for services. That created data quality and reporting problems and -- worse --
negatively affected employees' perception of the new system.
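
As a rough illustration of the modeling choice at stake, the sketch below shows one way a many-to-many relationship between "customer" and "pet" could be represented, so that two household members share a single pet record instead of duplicating it. It uses Python's built-in sqlite3 module; the table names, column names and sample data are hypothetical and are not taken from the company's actual system.

```python
import sqlite3


def build_schema(conn):
    """Create hypothetical customer, pet and pet_owner tables."""
    conn.executescript(
        """
        CREATE TABLE customer (
            customer_id INTEGER PRIMARY KEY,
            name        TEXT NOT NULL
        );
        CREATE TABLE pet (
            pet_id INTEGER PRIMARY KEY,
            name   TEXT NOT NULL
        );
        -- Junction table: one row per (customer, pet) ownership link.
        -- This is what allows a pet to have more than one owner.
        CREATE TABLE pet_owner (
            customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
            pet_id      INTEGER NOT NULL REFERENCES pet(pet_id),
            PRIMARY KEY (customer_id, pet_id)
        );
        """
    )


def demo():
    """Link two household members to one pet and report what was stored."""
    conn = sqlite3.connect(":memory:")
    build_schema(conn)

    # Two customers from the same household (made-up sample data).
    conn.executemany(
        "INSERT INTO customer VALUES (?, ?)", [(1, "Alex"), (2, "Sam")]
    )
    # The pet is stored exactly once.
    conn.execute("INSERT INTO pet VALUES (1, 'Rex')")
    # Both customers are recorded as owners of the same pet.
    conn.executemany("INSERT INTO pet_owner VALUES (?, ?)", [(1, 1), (2, 1)])

    pet_count = conn.execute("SELECT COUNT(*) FROM pet").fetchone()[0]
    owners = [
        row[0]
        for row in conn.execute(
            """
            SELECT c.name
            FROM customer AS c
            JOIN pet_owner AS po ON po.customer_id = c.customer_id
            WHERE po.pet_id = 1
            ORDER BY c.name
            """
        )
    ]
    conn.close()
    return pet_count, owners


if __name__ == "__main__":
    print(demo())
```

Under a one-to-many design, the pet table would instead carry a single owner column, which is the assumption that forced duplicate pet records in the case described above. Whether a junction table is the right physical structure depends on the requirements a conceptual model would surface; this is only one possible outcome.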

Despite stories like these, some organizations may still need to be convinced that the time and
resources required for data modeling are worth spending. Most of the costs are in employee and
project time, though technology such as data profiling tools can also support the process. The
long-term benefits make data modeling worthwhile, Stiglich said.

"It's easier, and it costs a lot less, to fix something up front in the requirements and design
phase than once it gets into development and construction," he said.

It can be very expensive to fix problems caused by poor data modeling after the fact, he said,
especially if the core structure of a database is affected. The problems caused by a poorly
developed system can ripple through an organization, propagating data quality problems and
skepticism over data accuracy. Worse, if a problematic new system is a source for data warehouses
or business intelligence applications, it can sully decisions and insights derived from those
systems.

Estimating the ROI of data modeling

Specific project costs can be affected by doing -- or not doing -- data modeling, according to
Steve Hoberman, a Westfield, N.J.-based consultant, trainer and presenter, and author of
"Data Modeler's Workbench" and "Data Modeling Made Simple." These costs are worth investigating
when building a business case for any data modeling activities.

Data quality can be seriously affected, so potential costs related to poor data quality are
important to examine and include in a business case.

Support costs can be affected because systems are more difficult to support without a data
model.

Training new staff on systems is easier with a data model -- or more time-consuming and costly
without one.

Integration costs can be affected because many systems will eventually need to be integrated
with other systems -- a process that's much easier with documented data models.

To help put the importance of data modeling in perspective for businesspeople, Hoberman has used
two approaches successfully.

"You can draw the analogy to an architect. What happens if you create a building without a
blueprint?" Hoberman said. "Another approach you can take is instead of focusing on why you need a
data model -- think of the things that a data model gives you, like better data quality and common
data definitions."
