Ask any CEO whether they want to better leverage their data assets to drive growth, revenue, and productivity, and the answer will most likely be “yes, of course.” Ask many of them what that means or how they will do it, and their answers will be as disparate as most enterprises’ data strategies. To successfully control, utilize, analyze, and store the vast amounts of data flowing through organizations today, an enterprise-wide approach is necessary. The Chief Data Officer (CDO) is the newest member of the executive suite in many organizations worldwide. Their task is to develop and implement the strategies needed to harness the value of an enterprise’s data, while working alongside the CEO, CIO, CTO, and other executives. They are the vital “data” bridge between business and IT.
This paper is sponsored by: Paxata and CA Technologies

The growth of NoSQL data storage solutions has revolutionized the way enterprises deal with their data. Older, relational platforms are still used by most organizations, while the adoption of varying NoSQL platforms, including Key-Value, Wide Column, Document, Graph, and Hybrid data stores, is increasing at faster rates than ever seen before. Such implementations are causing enterprises to revise their Data Management procedures across the board, from governance to analytics, metadata management to software development, and data modeling to regulation and compliance.
The time-honored techniques for data modeling are being rewritten, reworked, and modified in a multitude of different ways, often wholly dependent on the NoSQL platform under development.
This research report analyzes a 2015 DATAVERSITY® survey titled “Modeling NoSQL.” The survey examined a number of crucial issues within the NoSQL world today, with a particular focus on data modeling.

The competitive advantages realized from a dependable Business Intelligence and Analytics (BI/A) program are well documented. Everything from reduced business costs and increased customer retention to better decision making and the ability to forecast opportunities has been observed in response to such programs. Implementing such a program remains a necessity for any growing or mature enterprise. The establishment of a comprehensive BI/A program that includes traditional Descriptive Analytics along with next-generation categories such as Predictive or Prescriptive Analytics is indispensable for business success.

This paper examines whether blockchain distributed ledger technology could improve the management of trusted information, specifically with respect to data quality. Improvement was assessed by comparing the impact of a distributed ledger as an authoritative source in TD Bank Group's Enterprise Data Quality Management Process against the use of standard authoritative sources such as databases and files. Distributed ledger technology was neither expected nor shown to change the Data Quality Management process itself. Our analysis focused on the execution advantages made possible by distributed ledger properties that make it an attractive resource for data quality management (DQM).

Change data capture (CDC) technology can modernize your data and analytics environment with scalable, efficient and real-time data replication that does not impact production systems.
To realize these benefits, enterprises need to understand how this critical technology works, why it’s needed, and what their Fortune 500 peers have learned from their CDC implementations. This book serves as a practical guide for enterprise architects, data managers and CIOs as they enable modern data lake, streaming and cloud architectures with CDC.
Read this book to understand:
• The rise of data lake, streaming and cloud platforms
• How CDC works and enables these architectures
• Case studies of leading-edge enterprises
• Planning and implementation approaches
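The core mechanic the book describes can be illustrated with a minimal sketch of log-based CDC, in which change events read from a source's transaction log are applied to a target replica in commit order, without querying the production tables. The event format and data here are invented for illustration, not drawn from any specific CDC product.

```python
# Minimal sketch of log-based change data capture (CDC): hypothetical
# change events are read from a source's transaction log and applied
# to a target replica without touching the production tables.

def apply_change(replica, event):
    """Apply one change event (op, key, row) to the target replica."""
    op, key = event["op"], event["key"]
    if op in ("insert", "update"):
        replica[key] = event["row"]   # upsert keeps the replica in sync
    elif op == "delete":
        replica.pop(key, None)        # tolerate deletes of unknown keys
    return replica

# A hypothetical stream of change events, in commit order.
change_log = [
    {"op": "insert", "key": 1, "row": {"name": "Ada", "city": "London"}},
    {"op": "insert", "key": 2, "row": {"name": "Lin", "city": "Taipei"}},
    {"op": "update", "key": 1, "row": {"name": "Ada", "city": "Paris"}},
    {"op": "delete", "key": 2},
]

replica = {}
for event in change_log:
    apply_change(replica, event)

print(replica)  # {1: {'name': 'Ada', 'city': 'Paris'}}
```

Because only the deltas flow to the target, the replica stays current without the full-table scans that batch replication would impose on the production system.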

Graph databases are about to catapult across the famous technology adoption chasm and land in start-ups, enterprises and government agencies across the globe. The adoption antibodies are subsiding as the power of natively connected data becomes fundamental to any organization looking for data-driven insights across operations, suppliers, and customers. Moore’s Law increases in storage capacity and processing power can no longer keep up with the pace of data expansion, yet how companies structure and analyze their data ultimately will impact their ability to compete. Unstructured, disconnected data is useless. Graph databases will rapidly jump from niche use cases to a transformative IT technology as they enable turning the data you collect into actionable insights. Data will become the single most differentiating asset for your organization.

This report analyzes the challenges faced when beginning a new Data Governance program and outlines the crucial elements of successfully executing such a program.
“Data Governance” is a term fraught with nuance, misunderstanding, myriad opinions, and fear. It is often enough to keep Data Stewards and senior executives awake late into the night.
The modern enterprise needs reliable and sustainable control over its technological systems, business processes, and data assets. Such control is essential to competitive success in an ever-changing marketplace driven by the exponential growth of data, mobile computing, social networking, the need for real-time analytics and reporting mechanisms, and increasing regulatory compliance requirements. Data Governance can enhance and buttress (or resuscitate, if needed) the strategic and tactical business drivers every enterprise needs for market success.
This paper is sponsored by: ASG, DGPO and DebTech International.

This report investigates the level of Information Architecture (IA) implementation and usage at the enterprise level. It is based primarily on an analysis of a 2013 DATAVERSITY™ survey on Data and Information Architecture.
This paper is sponsored by: HP, Vertica, Denodo, Embarcadero and CA Technologies.

Learn how to get started with Apache Spark™
Apache Spark™’s ability to speed up analytic applications by orders of magnitude, its versatility, and its ease of use are quickly winning over the market. With Spark’s appeal to developers, end users, and integrators solving complex data problems at scale, it is now the most active open source project in the big data community.
With rapid adoption by enterprises across a wide range of industries, Spark has been deployed at massive scale, collectively processing multiple petabytes of data on clusters of over 8,000 nodes. If you are a developer or data scientist interested in big data, learn how Spark may be the tool for you. Databricks is happy to present this ebook as a practical introduction to Spark.
Download this ebook to learn:
• Spark’s basic architecture
• Why Spark is a popular choice for data analytics
• What tools and features are available
• How to get started right away through interactive sample code

When enterprises consider the benefits of data analysis, what is often overlooked is the challenge of data variety, and the fact that the most successful outcomes are driven by it. Yet businesses are still struggling with how to query distributed, heterogeneous data using a unified data model.
Fortunately, Knowledge Graphs provide a schema-flexible solution based on modular, extensible data models that evolve over time to create a truly unified solution. How is this possible?
Download and discover:
• Why businesses should organize information using nodes and edges instead of rows, columns and tables
• Why schema-free and schema-rigid solutions eventually prove to be impractical
• The three categories of data diversity including semantic and structural variety
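The nodes-and-edges organization the bullets refer to can be sketched as a set of (subject, predicate, object) triples, the common shape of a knowledge graph. The entities and relationship names below are invented for illustration and not tied to any product.

```python
# Illustrative sketch: a knowledge graph as a set of
# (subject, predicate, object) triples. Nodes are entities; each edge
# carries a relationship name. New edge types can be introduced at any
# time without altering a schema.
triples = {
    ("acme", "type", "Company"),
    ("widget", "type", "Product"),
    ("acme", "sells", "widget"),
    ("acme", "locatedIn", "Berlin"),
}

def neighbors(graph, subject):
    """Return all (predicate, object) edges leaving a node."""
    return {(p, o) for (s, p, o) in graph if s == subject}

# Extending the model is just adding triples; no table migration needed.
triples.add(("widget", "certifiedBy", "TUV"))

print(sorted(neighbors(triples, "acme")))
```

The contrast with rows and columns is that structural variety costs nothing here: a fact about a new kind of relationship is one more triple, not a new column in every row.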

Acting Quickly – Or Not at All
The pace of business is accelerating. Enterprises must do more things, do them more quickly – and then adjust to market and competitive forces and do them differently. They must adapt in order to remain differentiated, and with that differentiation, hopefully build and sustain competitive advantage.

Deconstructing NoSQL: Analysis of a 2013 Survey on the Use, Production, and Assessment of NoSQL Technologies in the Enterprise
This report examines the non-relational database environment from the viewpoints of those within the industry, whether current or future adopters, consultants, developers, business analysts, vendors, or others.
This paper is sponsored by: MarkLogic, Cloudant and Neo4j.

SAP® solutions for enterprise information management (EIM) support the critical abilities to architect, integrate, improve, manage, associate, and archive all information. By effectively managing enterprise information, your organization can improve its business outcomes. You can better understand and retain customers, work better with suppliers, achieve compliance while controlling risk, and provide internal transparency to drive operational and strategic decisions.

Interactive applications have changed dramatically over the last 15 years. In the late ‘90s, large web companies emerged with dramatic increases in scale on many dimensions:
· The number of concurrent users skyrocketed as applications increasingly became accessible via the web (and later on mobile devices).
· The amount of data collected and processed soared as it became easier and increasingly valuable to capture all kinds of data.
· The amount of unstructured or semi-structured data exploded, and its use became integral to the value and richness of applications.
Dealing with these issues was increasingly difficult using relational database technology. The key reason is that relational databases are essentially architected to run on a single machine and to use a rigid, schema-based approach to modeling data.
Google, Amazon, Facebook, and LinkedIn were among the first companies to discover the serious
limitations of relational database technology for supporting these new application requirements.
Commercial alternatives didn’t exist, so they invented new data management approaches
themselves. Their pioneering work generated tremendous interest because a growing number of
companies faced similar problems. Open source NoSQL database projects formed to leverage the
work of the pioneers, and commercial companies associated with these projects soon followed.
Today, the use of NoSQL technology is rising rapidly among Internet companies and in the enterprise. It is increasingly considered a viable alternative to relational databases, especially as more organizations recognize that operating at scale is more effectively achieved by running on clusters of standard, commodity servers, and that a schema-less data model is often a better approach for handling the variety and types of data most often captured and processed today.
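The schema-less data model described above can be sketched in a few lines: documents in one collection need not share identical fields, unlike rows in a relational table with a fixed column list. The records and field names are hypothetical.

```python
# Sketch of a schema-less document model: each record carries only the
# fields it needs, and records in the same collection may differ in shape.
users = [
    {"id": 1, "name": "Ada"},                                 # minimal record
    {"id": 2, "name": "Lin", "emails": ["lin@example.com"]},  # extra field
    {"id": 3, "name": "Raj", "profile": {"theme": "dark"}},   # nested data
]

def names_with_field(collection, field):
    """Find documents that happen to carry a given optional field."""
    return [doc["name"] for doc in collection if field in doc]

print(names_with_field(users, "emails"))  # ['Lin']
```

In a relational table, adding the `emails` or `profile` attribute would require a schema migration across every row; here, variation between records is simply allowed.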

Enterprise metadata management and data quality management are two important pillars of successful enterprise data management for any organization. A well-implemented enterprise metadata management platform can enable successful data quality management at the enterprise level.
This paper describes in detail an approach to integrate data quality and metadata management leveraging the Adaptive Metadata Manager platform. It explains the various levels of integrations and the benefits associated with each.

Metadata defines the structure of data in files and databases, providing detailed information about entities and objects. In this white paper, Dr. Robin Bloor and Rebecca Jozwiak of The Bloor Group discuss the value of metadata and the importance of organizing it well, which enables you to:
- Collaborate on metadata across your organization
- Manage disparate data sources and definitions
- Establish an enterprise glossary of business definitions and data elements
- Improve communication between teams
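As a concrete illustration of metadata paired with a business glossary, the sketch below joins structural metadata about a table with enterprise glossary definitions. All table, column, and definition names are invented for this example.

```python
# Hypothetical structural metadata for one table, alongside an
# enterprise glossary that supplies business definitions for its columns.
table_metadata = {
    "table": "customers",
    "columns": [
        {"name": "cust_id", "type": "int", "nullable": False},
        {"name": "signup_dt", "type": "date", "nullable": True},
    ],
}

glossary = {
    "cust_id": "Unique identifier assigned to a customer at onboarding.",
}

def describe(meta, glossary):
    """Join structural metadata with business definitions per column."""
    return [
        (col["name"], col["type"], glossary.get(col["name"], "(no definition)"))
        for col in meta["columns"]
    ]

print(describe(table_metadata, glossary))
```

The gap the function exposes, columns with "(no definition)", is exactly the kind of finding that an organized metadata practice surfaces for teams to resolve.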

Modern enterprises face increasing pressure to deliver business value through technological innovation that leverages all available data. At the same time, those enterprises need to reduce expenses to stay competitive, deliver results faster to respond to market demands, use real-time analytics so users can make informed decisions, and develop new applications with enhanced developer productivity. All of these factors put big data at the top of the agenda.
Unfortunately, the promise of big data has often failed to deliver. With the growing volumes of unstructured and multi-structured data flooding into our data centers, the relational databases that enterprises have relied on for the last 40 years are now too limiting and inflexible. New-generation NoSQL (“Not Only SQL”) databases have gained popularity because they are ideally suited to deal with the volume, velocity, and variety of data that businesses and governments handle today.

Data management is becoming more and more central to the business model of enterprises. The time when data was looked at as little more than a byproduct of automation is long gone, and today we see enterprises vigorously engaged in trying to unlock maximum value from their data, even to the extent of directly monetizing it. Yet many of these efforts are hampered by immature data governance and management practices stemming from a legacy that did not pay much attention to data. Part of this problem is a failure to understand that there are different types of data, and that each type has its own special characteristics, challenges, and concerns. Reference data is a special type of data: it essentially consists of codes whose basic job is to turn other data into meaningful business information and to provide an informational context for the wider world in which the enterprise functions.
This paper discusses the challenges associated with implementing a reference data management solution and the essential components of any vision for the governance and management of reference data. It covers the following topics in some detail:
· What is reference data?
· Why is reference data management important?
· What are the challenges of reference data management?
· What are some best practices for the governance and management of reference data?
· What capabilities should you look for in a reference data solution?
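The role reference data plays, codes that turn raw values into meaningful business information, can be shown with a small sketch. The code table and records below are hypothetical; the point is that one small, governed set of codes gives meaning to values scattered across transactional data.

```python
# Hypothetical reference data: a governed code table mapping country
# codes to their business meaning.
COUNTRY_CODES = {"DE": "Germany", "FR": "France", "JP": "Japan"}

# Transactional records that carry only the raw codes.
orders = [
    {"order_id": 100, "country": "DE"},
    {"order_id": 101, "country": "JP"},
]

def enrich(order, codes):
    """Translate a raw code into meaningful business information."""
    code = order["country"]
    return {**order, "country_name": codes.get(code, "UNKNOWN " + code)}

print([enrich(o, COUNTRY_CODES) for o in orders])
```

A code that fails to resolve ("UNKNOWN …") is a classic reference data quality signal, which is why governing these small tables carefully matters far beyond their size.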

This paper presents a practitioner-informed roadmap intended to assist enterprises in maturing their Enterprise Information Management (EIM) practices, with a specific focus on improving Reference Data Management (RDM).
Reference data is found in every application an enterprise uses, including back-end systems, front-end commerce applications, data exchange formats, outsourced and hosted systems, big data platforms, and data warehouses. It can easily account for 20–50% of the tables in a data store, and its values are used throughout the transactional and mastered data sets to keep a system internally consistent.

Data governance is a lifecycle-centric asset management activity. To understand and realize the value of data assets, it is necessary to capture information about them (their metadata) in a connected way. Capturing the meaning and context of diverse enterprise data in connection to all assets in the enterprise ecosystem is foundational to effective data governance. Therefore, a data governance environment must represent assets and their role in the enterprise using an open, extensible, and “smart” approach. Knowledge graphs are the most viable and powerful way to do this. This short paper outlines how knowledge graphs are flexible, evolvable, semantic, and intelligent. It is these characteristics that enable them to:
• capture the description of data as an interconnected set of information that meaningfully bridges enterprise metadata silos.
• deliver integrated data governance by addressing all three aspects of data governance: Executive Governance, Representative Governance, and App

Using ERwin Data Modeler & Microsoft SQL Azure to Move Data to the Cloud within the DaaS Lifecycle
by Nuccio Piscopo
Cloud computing is one of the major growth areas in the world of IT. This article provides an analysis of how to apply the DaaS (Database as a Service) lifecycle working with ERwin and the SQL Azure platform. It should help enterprises obtain the benefits of DaaS and take advantage of its potential for the improvement and transformation of data models in the Cloud. The use case introduced identifies key actions, requirements, and practices that can help formulate a plan for successfully moving data to the Cloud.

This second paper in a three-part series by David Loshin explores some challenges in bootstrapping a data governance program, and then considers key methods for using metadata to establish the starting point for data governance. The paper focuses on how metadata management facilitates progress along three facets of a data governance program: assessment, collaboration, and operationalization.

Add Big Data Technologies to Get More Value from Your Stack
Taking advantage of big data starts with understanding how to optimize and augment your existing infrastructure. Relational databases have endured for a reason – they fit well with the types of data that organizations use to run their business. These types of data in business applications such as ERP, CRM, EPM, etc., are not fundamentally changing, which suggests that relational databases will continue to play a foundational role in enterprise architectures for the foreseeable future. One area where emerging technologies can complement relational database technologies is big data. With the rapidly growing volumes of data, along with the many new sources of data, organizations look for ways to relieve pressure from their existing systems. That’s where Hadoop and NoSQL come in.
