Four Key Dimensions of Scale: Insights from Gartner Data and Analytics Summit

I was fortunate to be able to attend the recent Gartner Data & Analytics Summit in Grapevine, Texas. Several major themes continued to dominate, including: making analytics part of every action, business process and decision; developing business trust; building and executing a data and analytics strategy; and driving innovation through leading technologies.

Looking to the future, Gartner predicts that by 2020 artificial intelligence (AI) and natural language will be part of 90% of all BI platforms and 80% of all companies will have a chief data officer (CDO). According to the keynoters, by 2021 the office of the CDO will be seen as a mission-critical function comparable to IT, business operations, HR and finance in 75% of large enterprises. We can no longer process data like we did previously. We have to find new ways to scale, and the new technologies required were showcased in the presentations at this Summit.

In the latest CIO Gartner survey, technologies expected to help businesses differentiate themselves from their competitors were led again by BI/analytics at 26%, followed by digitalization/digital marketing at 14%, cloud services at 14%, mobility/mobile applications at 6% and IoT at 6%.

The opening keynote, presented by Gartner’s Carlie Idoine, Research Director, and Research VPs Kurt Schlegel and Rita Sallam, was entitled “Scale the Value of Data and Analytics.” They stressed that the need for data and analytics is pervasive, and it underpins every business model, every public service mission and even our personal lives. In addition, the dramatic rise in AI technologies and the associated influx of data from those technologies contribute even more data that requires analysis. In order to get the full value from data, according to the keynoters, it is necessary to accelerate analytical discovery by mastering the four key dimensions of scale: trust, diversity, complexity and literacy.

Establishing Trust in the Data Foundation

Carlie Idoine presented on establishing trust in the data foundation. The call to action is to drive trust through verification, which can be done through crowdsourcing metadata creation, automating metadata creation with data catalogs, and balancing data lakes with data warehouses. In fact, Kurt Schlegel reminded attendees that “Data catalogs are the new black.” According to Schlegel, a data catalog maintains an inventory of data assets through the discovery, description and organization of datasets. The capabilities of a data catalog solution are to create an inventory of information assets, to collaborate for accountability and governance, and to communicate and share semantic meaning.
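The catalog capabilities Schlegel describes can be pictured as a simple data structure. The following is a minimal, hypothetical sketch (the class and field names are illustrative, not drawn from any vendor's product) showing an inventory of assets plus crowdsourced annotation:

```python
from dataclasses import dataclass, field

# Hypothetical sketch of a minimal data catalog entry; field names are
# illustrative, not taken from any specific vendor's product.
@dataclass
class CatalogEntry:
    name: str                      # dataset identifier
    source: str                    # where the data physically lives
    description: str = ""          # human-curated business context
    tags: list = field(default_factory=list)
    annotations: list = field(default_factory=list)  # crowdsourced notes

class DataCatalog:
    """Inventory of data assets supporting discovery and collaboration."""
    def __init__(self):
        self._entries = {}

    def register(self, entry: CatalogEntry):
        self._entries[entry.name] = entry

    def annotate(self, name: str, user: str, note: str):
        # Crowdsourced metadata: any data consumer can contribute context.
        self._entries[name].annotations.append((user, note))

    def search(self, tag: str):
        # Discovery: find assets by shared semantic tags.
        return [e.name for e in self._entries.values() if tag in e.tags]

catalog = DataCatalog()
catalog.register(CatalogEntry("orders_2024", "s3://lake/orders/",
                              "Raw order events", tags=["sales", "raw"]))
catalog.annotate("orders_2024", "analyst_a", "Timestamps are UTC.")
print(catalog.search("sales"))  # ['orders_2024']
```

Real catalog products add automated metadata harvesting and governance workflows on top of this kind of inventory, but the core idea is the same: one searchable place where technical metadata and human context meet.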

Data lakes were also discussed because they provide an architecture that can scale to handle the volume and variety that a digital business requires. The data warehouse provides the consensus needed to run the business. Because of the diversity of data in the data lake, Schlegel reminded attendees that, “We don’t just need lifeguards in the data lake. We need marine biologists who look at the quality of the water and tell us where it’s safe to drink.”

Promoting a Culture of Diversity

Rita Sallam stressed the importance of making diversity a core principle of an organization’s data and analytics program. Promoting diversity that you can’t see, reducing bias in algorithms by adding diversity, and leveraging diverse data sources were outlined as components of this effort. Gartner has elevated diversity and inclusion as a critical tenet for enabling innovation and scaling the value of data and analytics. Sallam mentioned that companies in the top quartile for diversity are more likely to have financial returns 33% above the industry mean.

Mastering the Complexity of Running a Digital Business

Complexity was covered by the keynoters, who described the importance of empowering many small teams and leveraging more precise data and analytics platforms that provide more context, more understanding and more timely insights.

Driven by the massive amounts of data that we are collecting at every level within the organization, we are truly in a data-driven world that is more decentralized. Data is coming from all directions including social, behavioral, sensor, transactional, environmental sources and many more, thereby increasing the complexity of analytics and business intelligence.

Building the Data Literacy of Your Workforce

Gartner formally defines data literacy as: The ability to read, write and communicate data in context, including an understanding of data sources and constructs, analytical methods and techniques applied, and the ability to describe the use case application and resulting value. Creating a common language and culture around data was the call to action for scaling literacy. This can be accomplished by providing training in context, creating a certification system (akin to a driver’s license for analytics), and leveraging augmented analytics.

Another 2020 prediction from Gartner is that 80% of organizations will initiate deliberate competency development in the field of data literacy, acknowledging their extreme deficiency. Idoine stated, “Developing this type of data literacy can be disruptive. Assessing the data literacy of people who create and consume information is a critical step to ensure the organization is enabled with the right skills to meet current and future requirements of digital society.”

The Future of BI and Analytics

According to Gartner, BI and analytics beyond 2020 will be supported by AI at a higher level. For example, self-service analytics will be primarily handled by artificial intelligence. Data science and decision making supported by AI will be a common capability, and analytics will be better performed by computers than humans for many use cases. The impact of AI on organizations will be transformative.

Advanced Technologies

While at the Summit, I was able to connect with several companies that are providing the advanced technologies being highlighted by Gartner. Following is a brief summary of these companies’ capabilities:

ASG: ASG provides its clients with solutions for information access, management and control. For example, one ASG client with more than 2,500 applications, hundreds of data stores and dozens of technologies implemented Data Intelligence for GDPR, application development and change management, data quality, and worldwide tool consolidation. ASG DI software automates the collection of that client’s Privacy Information Inventory to support GDPR compliance.

Alation offers a data catalog built to deliver productivity increases for analysts, data scientists and other data consumers. Alation's Data Catalog uses machine learning to automate the collection of technical metadata and business context, balanced by a simple yet elegant interface to collect human inputs through crowdsourcing and collaboration in order to curate the catalog.

Arcadia Data is defining the next era of analytics and BI for data lakes. The company provides the first visual analytics and BI platform native to big data that delivers the scale, performance, and agility business users need to discover and productionize real-time insights. With Arcadia Data, business users can immediately discover new insights within data lakes in the cloud and/or on-premises without requiring a separate BI server or edge node, and the platform’s Smart Acceleration then productionizes visual analytics for hundreds or thousands of users.

Attunity provides a data integration software platform to deliver data efficiently, in real-time and with no manual coding to traditional database, data warehouse, data lake, streaming and cloud architectures. Attunity software integrates with all major end points to non-disruptively replicate data from production sources such as Oracle, mainframe and SAP. Attunity also accelerates data lake pipelines by automating the creation, updates and provisioning of analytics-ready data. The company serves half of the Fortune 100, including Ford, Verizon and Cardinal Health.
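The replication technique underlying this kind of product is change data capture (CDC): rather than repeatedly copying full tables, only a log of inserts, updates and deletes is shipped and replayed on the target. The toy sketch below illustrates the general pattern only; it is not Attunity's API or architecture, and the record layout is invented for the example:

```python
# Toy illustration of change-data-capture (CDC) replication: replay a
# stream of (op, key, value) change records onto a target store. This is
# a generic sketch, not any vendor's actual interface.
def apply_changes(target: dict, change_log: list) -> dict:
    """Apply an ordered change log to a target key-value store."""
    for op, key, value in change_log:
        if op in ("insert", "update"):
            target[key] = value        # upsert the new row image
        elif op == "delete":
            target.pop(key, None)      # remove the row if present
    return target

# Changes captured from a hypothetical source system's transaction log.
source_changes = [
    ("insert", 1, {"customer": "A", "status": "active"}),
    ("insert", 2, {"customer": "B", "status": "active"}),
    ("update", 1, {"customer": "A", "status": "churned"}),
    ("delete", 2, None),
]
replica = apply_changes({}, source_changes)
print(replica)  # {1: {'customer': 'A', 'status': 'churned'}}
```

Because only deltas move across the wire, CDC is what makes "real-time and with no manual coding" replication from busy production sources practical without disrupting them.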

Birst, an Infor company, showcased its next-generation platform for enterprise analytics and business intelligence (BI), which connects centralized and decentralized applications through a network of virtual tenants, on top of a multi-tenant cloud architecture. One of the platform’s distinguishing characteristics is its ability to deliver Augmented Analytics. Birst uses machine learning to intelligently discover business-critical relationships in the data and automatically build visualizations and dashboards. Advanced algorithms take raw data and instantly structure it in an organized and consistent set of business metrics and attributes.

Cambridge Semantics offers an end-to-end, enterprise-scale open platform that creates a single semantic layer of an organization’s structured and unstructured data. The resulting fully governed data fabric is represented as a Knowledge Graph, capable of managing all enterprise data while also enabling users to conduct code-free, rich interactive discovery and analytics at speeds more than 100x faster than other approaches. The company also provides graph-based online analytics support for Amazon Neptune and graph database users through a native graph-based parallel query engine.

Cloudera, based in Palo Alto, California, offers Cloudera Enterprise, a platform that includes Cloudera Analytic DB (for BI & SQL workloads based on Apache Impala), Cloudera Data Science & Engineering (for data processing and machine learning based on Apache Spark and Cloudera Data Science Workbench), and Cloudera Operational DB (for real-time data serving based on Apache HBase and Apache Kudu). Through their SDX (shared data experience) technologies, the platform provides unified security, governance, and metadata management across these workloads as well as across deployment environments. Cloudera’s platform is available on-premises, across the major cloud environments (including native object store support for S3 and ADLS), and as a managed service under the Cloudera Altus brand.

Databricks provides a Unified Analytics Platform that consolidates data science and engineering in one workflow to help data professionals bridge the gap between raw data and analytics. The Databricks platform runs Spark applications in a secure cloud production environment and provides a collaborative, integrated workspace. Databricks democratizes and streamlines the process of exploring data, prototyping, and operationalizing data-driven applications.

DataScience.com offers an enterprise data science platform that centralizes data science tools, projects, and infrastructure in a fully governed workspace. By centralizing the entire lifecycle of data science work, data scientists can tackle projects faster – from data exploration to model deployment – and IT teams can manage resources and systems more efficiently. Leading organizations like Amgen, Rio Tinto, and Sonos are using the DataScience.com Platform to improve data science productivity, reduce operations costs, and deploy machine learning solutions faster to power their digital transformations.

Domo was a premier sponsor at the event, featuring in its speaking session a customer from EnerBank who talked about trust and empowering people with data. Domo positions itself as the operating system that allows business decision makers to run their entire business from their phones. Domo’s message is that it brings together data, systems, and people to finally deliver a digitally connected business. The company’s booth focused on how it helps customers combine billions of rows of data and deliver insights that everyone from senior leaders to merchandisers can use to understand in-store performance and optimize processes across corporate functions.

erwin, Inc. was a first-time attendee at the Gartner Data & Analytics Summit. The company was thrilled to continue its momentum in the data governance space after leaving CA Technologies in 2016. erwin DM has been a trusted name in data modeling for more than 30 years, and many loyal customers stopped by to say hello and hear about the expanded portfolio, which now includes enterprise architecture, business process modeling and data governance. The erwin EDGE platform delivers an “enterprise data governance experience” that brings together IT and the business for data-driven insights, agile innovation, regulatory compliance and business transformation.

HVR: With HVR, organizations can achieve continuous data integration between legacy and modern systems for real-time analytics. Their scalable replication solution provides everything needed for efficient, high-volume data integration from beginning to end. HVR recently announced their “data lake” release, which includes features such as Amazon KMS support, native Hive support, Big Data Compare functionality, and metadata manifests.

Information Builders provides business intelligence, analytics, and data management solutions and helps organizations identify opportunities to harmonize, operationalize, and monetize data to create insights that drive action. At the show, the company hosted a session on creating an analytics culture to support digital evolution with IoT, as well as a session on monetizing data with embedded BI. Their customer Lipari Foods was also selected by Gartner to present their story on how they harnessed real-time data to support better decisions by front-line managers.

MapD (now OmniSci) developed its Extreme Analytics platform for business and government leaders who have to make timely decisions based on an exponentially growing amount of data. Born out of research at MIT, MapD is a breakthrough technology, the first to harness the massive-parallel-processing and visual rendering power of GPUs for zero-latency analytics on structured datasets with billions of rows. Jason Sanders of Verizon joined MapD founder and CEO, Todd Mostak, in a speaking session to describe how Verizon analyzes billions of rows of new data each week, using MapD to detect network anomalies and provide industry-leading levels of mobile service.

Matillion provides purpose-built native ELT solutions for cloud data platforms designed for data-driven companies that are leveraging the cloud to gain insights. Matillion software allows its customers to extract, load and transform data into and on cloud-based data warehouses, quickly and at scale.
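The ELT pattern Matillion sells tooling for differs from classic ETL in one key way: raw data is loaded first, and the transform step runs inside the warehouse itself, where the compute lives. The sketch below illustrates the pattern using SQLite purely for the sake of a self-contained example; the table and column names are invented, and cloud warehouses, not SQLite, are the real target of such tools:

```python
import sqlite3

# Minimal sketch of the ELT pattern: (E)xtract and (L)oad raw records
# first, then (T)ransform in-place with SQL pushed down to the database.
# SQLite stands in for a cloud data warehouse; names are illustrative.
conn = sqlite3.connect(":memory:")

# Load step: land the raw data untransformed.
conn.execute("CREATE TABLE raw_orders (id INTEGER, amount_cents INTEGER, region TEXT)")
conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?)",
                 [(1, 1250, "west"), (2, 800, "east"), (3, 2400, "west")])

# Transform step: runs where the data already lives, as set-based SQL.
conn.execute("""
    CREATE TABLE sales_by_region AS
    SELECT region, SUM(amount_cents) / 100.0 AS total_dollars
    FROM raw_orders
    GROUP BY region
""")
print(conn.execute("SELECT * FROM sales_by_region ORDER BY region").fetchall())
# [('east', 8.0), ('west', 36.5)]
```

Deferring the transform into the warehouse is what lets ELT tools scale with the warehouse's own elastic compute instead of a separate transformation server.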

Qlik, a leader in data analytics, empowers and enables organizations to see the whole story that lives within their data. Its portfolio of cloud-based and on-premises solutions meets customers’ growing needs from reporting and self-service visual analysis to guided, embedded and custom analytics, regardless of where data is located. By using Qlik Sense, QlikView and Qlik Cloud, its global customers gain meaning out of information from multiple sources and explore the hidden relationships within data to create actionable insights and lead strategic decision making.

Riversand works with customers who embrace data management to drive key initiatives such as digital transformation, operational excellence and strategic decision making. Together with its customers, the company has pioneered data intelligence and automation to empower the use of better data. Riversand creates software that breaks the barrier between data and business by unifying master data across the enterprise and providing a smarter customer experience.

Semarchy provides an Intelligent Data Hub for MDM, application data management and data governance. They have organically grown an all-in-one hybrid multi-vector MDM solution. For example, Semarchy customer Chipotle was able to shift control of data from IT to the business user, leading to increased data accuracy, decreased system reaction time, and increased quality of customer service. With Semarchy’s POV approach, customers de-risk their implementation. They can experiment with a full working model and get started quickly.

Sisense provides technology that allows users to uncover insights from complex data. Sisense 7.0 delivers an intuitive, visual, drag-and-drop interface for data preparation that non-technical business users can use to easily find, add, and combine complex data sources. It also takes advanced design and visualization concepts commonly used for the creation of dashboards and analytics and applies them in a new way. Sisense is expanding the application of embedded analytics, with client-facing embedding, internal embedding, and embedding via mobile devices.

Teradata: Teradata Everywhere offers the ability to run the Teradata Data Warehouse on-premises, in public clouds (AWS, Azure), and in private clouds (VMware). New subscription pricing allows portability of licenses across all hosting environments at the same price. Teradata Analytics Platform, announced in 2017, is the first in a series of expansions beyond the traditional data warehouse to encompass hundreds of parallel algorithms plus open source integration. At the Gartner Summit, Teradata CTO Stephen Brobst and Cisco’s Maciej Kranz explained how Teradata is working with Cisco Kinetic to provide sensor data analysis for smart cities.

Ron is an independent analyst, consultant and editorial expert with extensive knowledge and experience in business intelligence, big data, analytics and data warehousing. Currently president of Powell Interactive Media, which specializes in consulting and podcast services, he is also Executive Producer of The World Transformed Fast Forward series. In 2004, Ron founded the BeyeNETWORK, which was acquired by Tech Target in 2010. Prior to the founding of the BeyeNETWORK, Ron was cofounder, publisher and editorial director of DM Review (now Information Management). He maintains an expert channel and blog on the BeyeNETWORK and may be contacted by email at rpowell@powellinteractivemedia.com.