Technical details, ideas and news on data warehousing and big data from the Oracle Team

Big Data: Achieve the Impossible in Real-Time

Sure, we all want to make the impossible possible… in any scenario, in any business. Here we are talking about driving performance to levels previously considered impossible and doing so by using just data and advanced analytics.
An amazing example of this is the BMW Oracle America's Cup boat and its use of sensor data and deep analytics (story here).

Consider these two quotes from the article:

"They were measuring an incredible number of parameters across the trimaran, collected 10 times per second, so there were vast amounts of [sensor] data available for analysis. An hour of sailing generates 90 million data points."

"[…] we could compare our performance from the first day of sailing to the very last day of sailing, with incremental improvements the whole way through. With data mining we could check data against the things we saw, and we could find things that weren't otherwise easily observable and findable."
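As a rough sanity check on the first quote, the arithmetic below works out how many parameters it would take to produce 90 million data points in an hour at 10 samples per second. It assumes every parameter is sampled at the same rate for the full hour, which the article does not state, so treat the result as an order-of-magnitude estimate only.

```python
# Back-of-envelope check of the quoted figures.
# Assumption (not from the article): every parameter is sampled
# uniformly at 10 Hz for the whole hour.
SAMPLES_PER_SECOND = 10
SECONDS_PER_HOUR = 3600
POINTS_PER_HOUR = 90_000_000

# Samples produced by a single parameter in one hour of sailing.
samples_per_parameter = SAMPLES_PER_SECOND * SECONDS_PER_HOUR

# Number of parameters needed to reach the quoted hourly total.
implied_parameters = POINTS_PER_HOUR // samples_per_parameter

print(f"Samples per parameter per hour: {samples_per_parameter:,}")
print(f"Implied number of parameters:   {implied_parameters:,}")
```

Under that assumption the quoted numbers imply a few thousand measured parameters, which fits the description of "an incredible number of parameters across the trimaran".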

The end result of all of this (and do read the entire article, it is truly amazing, with things like data projected in sunglasses!) is that the guys on the boat can make a sailboat go THREE times as fast as the wind that propels it.

To make this magic happen, a couple of things had to be done:

Put the sensors in place and capture all the data

Organize the data and analyze all of it in real-time

Provide the decisions to the people who need them, exactly when they need them (like in the helmsman’s sunglasses!)

Convince the best sailors in the world to trust and use the analysis to drive the boat

Since this blog is not about sailing but about data warehousing, big data and other (only slightly) less cool things, the intent is to explain how you can deliver magic like this in your company.

Move your company onto the next value curve

The above example gives you an actual environment where the combination of high volume, high velocity sensor data, deep analytics and real-time decisions is used to drive performance. This example is a real big data story.

Sure, a multi-billion dollar business will often collect more data, but the point of the above story is analyzing a previously unseen, massive influx of data – the team estimated 40x more data than in conventional environments. The especially interesting aspect, however, is that decisions are automated. Rather than flooding the sunglasses with data, only relevant decisions and data are projected. The helmsman did not need to interpret the data; he simply needed to act on the decision points.
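To make the idea of "decision points instead of raw data" concrete, here is a minimal sketch in Python. It is not how the team's system worked; the `Reading` type, the `boat_speed` field, the one-second window and the 0.5 knot threshold are all hypothetical choices made for illustration. The sketch averages a high-rate stream into fixed windows and emits a message only when the average drops noticeably from one window to the next.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Iterable, Iterator, Tuple


@dataclass
class Reading:
    """One sensor sample (hypothetical shape, for illustration only)."""
    t: float           # seconds since the start of the run
    boat_speed: float  # knots


def decision_points(
    readings: Iterable[Reading],
    window: int = 10,             # assumed: 10 samples = 1 second at 10 Hz
    drop_threshold: float = 0.5,  # assumed: knots lost between windows
) -> Iterator[Tuple[float, str]]:
    """Yield (time, message) only when average speed drops between windows."""
    block = []
    previous_avg = None
    for r in readings:
        block.append(r.boat_speed)
        if len(block) < window:
            continue  # keep collecting until the window is full
        current_avg = mean(block)
        block.clear()
        if previous_avg is not None and previous_avg - current_avg > drop_threshold:
            yield (r.t, "speed dropping: adjust trim")
        previous_avg = current_avg


# Three seconds of synthetic data at 10 Hz: steady, steady, then slower.
stream = [Reading(t=i / 10, boat_speed=20.0 if i < 20 else 19.0) for i in range(30)]
decisions = list(decision_points(stream))
for t, message in decisions:
    print(f"t={t:.1f}s {message}")
```

Thirty raw samples go in and a single instruction comes out, which is the point: the person acting on the data sees the decision, not the stream behind it.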

To project the idea of acting on decision points into an organization, your IT will have to start changing, as will your end users. To do so, you need to jump onto the bandwagon called big data. The following describes how to get on that bandwagon.

Today, your organization is doing the best it can by leveraging its current IT and DW platforms. That means – for most organizations – that you have squeezed all the relevant information out of the historical data assets you analyze. You are the dot on the lower value curve and you are on the plateau. Any extra dollar invested in the plateau is just about keeping the lights on, not about generating competitive advantage or business value. To jump to the next curve, you need to find some way to harness big data and meet the challenges it imposes.

From an infrastructure perspective, you must design a big data platform. That big data platform is a fundamental part of your IT infrastructure if your company wants to compete over the next few years.

The main components in the big data platform provide:

Deep Analytics – a fully parallel, extensive and extensible toolbox full of advanced and novel statistical and data mining capabilities

High Agility – the ability to create temporary analytics environments in an end-user driven, yet secure and scalable environment to deliver new and novel insights to the operational business

Low Latency – the ability to instantly act based on these advanced analytics in your operational, production environments

Read between the lines and you see that the big data platform is based on the three hottest topics in the industry: security, cloud computing and big data, all working in conjunction to deliver the next generation big data computing platform.

Over the next couple of years, the companies that drive efficiency, agility and IT as a service via the cloud, that drive new initiatives and top line growth by leveraging big data and analytics, and that keep all their data safe and secure will be the leaders in their industry.

Oracle is building the next generation big data platforms on these three pillars: cloud, security and big data. Over the next couple of months – leading up to Oracle OpenWorld – we will cover details about Oracle’s analytical platform and in-memory computing for real-time big data (and general purpose speed!) on this blog.

A little bit of homework is required to prepare you for those topics. If you have not yet read the following, do give them a go; they are a good read:

These older blog posts will give you an understanding of in-database MapReduce techniques, show how to integrate with Hadoop, and offer a peek at some futuristic applications that I think would be generally cool and will surely be coming down the pipeline in some form or fashion.