In the first part of this blog series I discussed the traditional data warehouse and some of the issues that can occur in the period following the initial project. I concluded by outlining the 5 common challenges that I hear from customers.

Inflexibility

Complex Architecture

Slow Performance

Old Technology

Lack of Governance

This blog will focus on the second of these issues, complex architecture.

In the previous posts in this series we have discussed the origins of the traditional data warehouse and outlined the basic architecture required. The following is a simple implementation architecture for a multi-source data warehouse designed to support traditional Business Intelligence workloads, and one that I am sure most of you reading this post will recognise.

Store and Optimise – the corporate memory where all the relevant data is structured, stored and optimised for performance reporting. Generally speaking, this has traditionally been a relational database management system such as Oracle or SQL Server, but may occasionally be a higher-performance analytics database such as Sybase IQ or an appliance from Netezza or Teradata.

Front End Analytics – traditionally, the bulk of information requirements were for structured, paper-based reporting that was predominantly historical in nature. This is often accompanied by a need to provide end-user ad-hoc reporting and some level of KPI-based performance dashboard. Typical solutions here include SAP BusinessObjects, Cognos, Microsoft Excel, QlikView or any one or more of the many others available.

Good for Then, But What About Now?

On one level this type of architecture was fine when we were dealing with on-site, relational, structured data stored predominantly inside accessible SQL-based databases. For some organisations this is still the comfortable reality. However, I am increasingly hearing from data warehousing and BI teams that are struggling to implement manageable architectures to address the complex and evolving needs of their users. For instance, what if you found yourself confronted with the following requirements:

The Marketing team needs to build a customer churn / prioritisation predictive model. This will be built on cloud-based mobile app behaviour data combined with CRM data from the data warehouse.

The Maintenance and Support team wants to leverage new sensor technologies to get a real-time view of how customers are using their products, perhaps to improve service levels or even offer more efficient preventative maintenance.

The Finance team wants to perform more advanced Planning, Budgeting and Forecasting operations using the data warehouse, ideally without having to run endless extracts.

Senior Management would like to make use of their iPads. This creates a need to deliver real-time KPI information that can be drilled down into.
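To make the first of these requirements a little more concrete, here is a purely hypothetical sketch in plain Python. All of the field names, the record shapes and the scoring heuristic are invented for illustration; a real churn model would be trained statistically, but the basic shape of the task (blend cloud app-behaviour data with warehouse CRM data, then rank customers by risk) looks something like this:

```python
# Hypothetical sketch: combine CRM records (from the warehouse) with
# mobile app behaviour events (from a cloud source) to rank churn risk.
# Field names and the scoring rule are illustrative, not a real model.

crm_records = [
    {"customer_id": 1, "tenure_months": 36, "open_tickets": 0},
    {"customer_id": 2, "tenure_months": 4,  "open_tickets": 3},
]

app_events = [
    {"customer_id": 1, "logins_last_30d": 25},
    {"customer_id": 2, "logins_last_30d": 1},
]

def churn_score(crm, usage):
    """Toy heuristic: low tenure, low usage and open tickets raise risk."""
    score = 0.0
    score += 1.0 / max(crm["tenure_months"], 1)      # newer customers are riskier
    score += 1.0 / max(usage["logins_last_30d"], 1)  # disengaged customers are riskier
    score += 0.1 * crm["open_tickets"]               # unresolved issues add risk
    return round(score, 3)

# Join the two sources on customer_id, then rank highest risk first.
usage_by_id = {e["customer_id"]: e for e in app_events}
ranked = sorted(
    ((c["customer_id"], churn_score(c, usage_by_id[c["customer_id"]]))
     for c in crm_records),
    key=lambda pair: pair[1],
    reverse=True,
)
print(ranked)
```

The point of the sketch is not the scoring rule but the join: half of the inputs live outside the warehouse, which is exactly what strains the traditional architecture.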

So what do these types of requirements do to our traditional “simple” architecture? They make it significantly more complex.

Source Data – understanding new data sources and whether they can be made available for downstream consumption.

Extract, Transform and Load – if we are no longer simply dealing with batch-loaded relational data, then we need additional connectivity solutions to handle new consumption models. If we are streaming data as well as batch loading, then clearly we will need a different type of monitoring and scheduling / management layer.
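The batch-versus-streaming distinction can be sketched in a few lines of plain Python (the record shapes and the transformation are invented for the example): a batch load sees the whole extract up front, while a streaming consumer must process records one at a time as they arrive, which is why it needs a different monitoring and scheduling model.

```python
# Illustrative only: contrasts a batch ETL pass with a streaming consumer.

def transform(record):
    """Shared transformation step (here: just normalising field names)."""
    return {"customer_id": record["id"], "amount": record["amt"]}

def batch_load(extract):
    """Batch model: the complete extract is available before we start."""
    return [transform(r) for r in extract]

def stream_consume(event_source):
    """Streaming model: records are processed one by one as they arrive."""
    for record in event_source:
        yield transform(record)

# Batch: a nightly extract, loaded in one pass.
nightly_extract = [{"id": 1, "amt": 10.0}, {"id": 2, "amt": 5.5}]
loaded = batch_load(nightly_extract)

def sensor_feed():
    """Stand-in for a real-time source, e.g. a message queue subscription."""
    yield {"id": 3, "amt": 7.25}
    yield {"id": 4, "amt": 2.0}

# Streaming: the same transform, applied incrementally.
streamed = list(stream_consume(sensor_feed()))
print(loaded, streamed)
```

The transformation logic can be shared, but everything around it (scheduling, failure recovery, monitoring) differs between the two models.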

Store and Optimise – so-called "Big Data" requirements change the dynamic of our storage layer. Simplistically put, we need a repository capable of addressing the 3 V's (Variety, Velocity and Volume); in other words, unstructured, fast-moving and high-volume data. Typically, this is where we see teams looking towards Hadoop and similar solutions, but can they be used for a traditional SQL-based data warehouse? Can Hadoop deliver the performance we need, and if not, can we get it to work with our existing data warehouse?

Front End Analytics – many business users have grown tired of waiting for their IT teams to deliver requested functionality and have purchased their own departmental solutions, often beyond the scope of the original reporting tools that were implemented. These solutions frequently duplicate existing data in order to address "advanced analytics" workloads such as predictive and statistical modelling, creating more silos of data that need to be managed and updated to ensure they remain accurate and consistent.

It should be pretty clear by now that this organic sprawl of a data architecture is less than optimal. This bolt-on approach to addressing changing requirements results in a number of key issues:

Integration – how do you get all these new technologies to talk to each other?

Overlap / redundancy – many tools with similar capabilities

Risk – who truly understands all of the components?

All this adds up to a complex architecture, and that results in increased costs and lost agility.

So What’s the Answer?

Is there an alternative to organically stretching your existing data warehouse architecture, bolting on new capabilities as you go? Can we move to a new architecture that will deliver the kind of experience that our users desire?

Which of these collections of properties looks like it would be the easiest to manage, maintain, extend and ultimately live with?

What if you could implement a data architecture that was fully integrated, including management and monitoring, with the ability to monitor not only the end-to-end process but also the quality and lineage of the data itself?

What if you had a data architecture that could seamlessly ingest, store and optimally respond to both traditional transactional BI workloads and newer Big Data workloads, based on batch and real-time data?

What if this architecture were open-standards compliant, able to support best-in-breed BI tooling in order to address advanced use cases such as predictive analytics?

Finally, what if all this could be delivered either on-premises or in the cloud?

Would you be interested in understanding what this data architecture looked like?

The answer, of course, is yes.

The Modern Data Platform

The Modern Data Platform delivers a foundation for agile modelling, delivery and iteration that rapidly shortens the time to deliver changes, thereby lowering the total cost of ownership of your Business Intelligence and Analytics solutions.

Remember to follow the rest of the posts in this blog series, where we will explore in detail the three remaining common challenges of traditional data warehousing.

Author:
Andy Steer

I have responsibility for defining and driving the evolution of the business in step with our customers and partners. I spend most of my time listening to and learning from 3 sets of people: our customers, our partners and our employees, to ensure that we are able to imagine, describe and deliver the innovative solutions that will secure our customers' future success. Before itelligence I worked in a variety of Consulting and Management roles in the field of Data Management and Business Intelligence for nearly 20 years, and I regularly present at industry events in the UK and beyond. When not at work I spend most of my spare time either drinking wine with my wife, playing Lego with my sons or getting depressed supporting Tottenham Hotspur.