Streaming data: The ins and outs of this technology buzzword

By Steve Putman, Principal Data Management Consultant, SAS

Imagine a contestant on the TV game show Jeopardy! who chooses the category “Current Technology Buzzwords for $800.” The game show host then reads the clue to the contestant: “A synonym for the phrase data in motion.” The contestant’s answer (which must be stated in the form of a question) is: “What is streaming data?”

As we explore this topic, a few other questions naturally follow:

How do big data and event stream processing (ESP) relate to streaming data?

What are the data management implications of streaming data?

Will streaming data affect my business (in other words, why should I care)?

What is streaming data?

Streaming data, or data in motion, is data that is not “at rest” – that is, data that has not yet been captured and stored in a database. Until recently, streaming data was used only in specialized fields – pharmaceutical research, energy production, capital markets and others – where equipment or devices constantly emit millions of transactions.

Data streams are closely tied to event stream processing software, a key way to process high-volume data. With today’s torrent of data spewing from countless sources, ESP techniques are being applied to many new areas to uncover insights – whether it’s from social media streams, consumer electronics or other technologies that didn’t even exist five years ago.

Big data and event stream processing

Depending on your perspective, event streams can be considered big data, which is characterized by three well-known “V’s” – volume, variety and velocity. The formats of event data are as varied as the devices and communication protocols themselves – including text data and transmissions from both legacy and modern equipment sensors. For a single device, the data format is known, so the “variety” criterion does not really apply to an isolated device transmission. But in general practice, applications process more than one stream of data to answer questions, and you have to normalize the multiple formats involved. The other two V’s apply as well, because event streams can carry hundreds of thousands to millions of events per second.
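
To make the normalization step concrete, here is a minimal sketch in Python. It assumes two hypothetical devices, one that emits JSON and one that emits comma-separated text with millisecond timestamps. The field names, the Event record and the normalize function are illustrative assumptions, not part of any ESP product.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Common shape that every stream is normalized into (hypothetical)."""
    device_id: str
    timestamp: float  # seconds since the epoch
    value: float


def from_json_sensor(raw: str) -> Event:
    # Assumed layout: {"id": ..., "ts": <seconds>, "reading": ...}
    doc = json.loads(raw)
    return Event(str(doc["id"]), float(doc["ts"]), float(doc["reading"]))


def from_csv_sensor(raw: str) -> Event:
    # Assumed layout: device_id,timestamp_in_milliseconds,value
    device_id, ts_ms, value = raw.strip().split(",")
    return Event(device_id, int(ts_ms) / 1000.0, float(value))


# One parser per known source format; each device's own format is fixed.
PARSERS = {"json": from_json_sensor, "csv": from_csv_sensor}


def normalize(source_format: str, raw: str) -> Event:
    """Convert one raw message into the common Event shape."""
    return PARSERS[source_format](raw)
```

Once both streams share one shape, a downstream application can compare or combine them without knowing which device produced a given event.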

One of the biggest developments for ESP in recent years is the advent of the Internet of Things (IoT). Soon, millions of objects like cars, appliances and gadgets will produce previously unavailable data about operations, activities and behaviors. Your business may be able to take advantage of these new data streams to reduce operating costs, improve product reliability or optimize usage models – just to give a few examples. With the IoT spurring increased uses of ESP, many businesses that have never considered ESP a viable technology will soon need to reconsider that choice.

Data management and streaming data

Managing streaming data is fundamentally different from managing most other types of transactional data, because much of the value in streaming data is in the aggregated measurements, not the individual transactions. In most applications, a single event doesn’t need to be saved in the enterprise’s infrastructure. That’s because ESP performs pattern detection – and when patterns are as expected, there is no additional value to retaining what is already known to be the norm.
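
The idea of keeping aggregated measurements rather than individual events can be sketched as follows. This is a simplified illustration, assuming events arrive in time order as (timestamp, value) pairs and are grouped into fixed one-minute windows. The WindowSummary record and the summarize function are hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class WindowSummary:
    """Aggregate kept for one window; the raw events are discarded."""
    window_start: float
    count: int
    mean: float
    minimum: float
    maximum: float


def _close(state):
    # Turn the running totals for a finished window into a summary.
    return WindowSummary(state["start"], state["count"],
                         state["total"] / state["count"],
                         state["min"], state["max"])


def summarize(events, window_seconds=60.0):
    """Yield one summary per fixed-size window of time-ordered events."""
    state = None  # running totals for the window currently open
    for timestamp, value in events:
        window_start = (timestamp // window_seconds) * window_seconds
        if state is not None and window_start != state["start"]:
            yield _close(state)  # the previous window is complete
            state = None
        if state is None:
            state = {"start": window_start, "count": 0, "total": 0.0,
                     "min": value, "max": value}
        state["count"] += 1
        state["total"] += value
        state["min"] = min(state["min"], value)
        state["max"] = max(state["max"], value)
    if state is not None:
        yield _close(state)  # flush the last, possibly partial, window
```

Only a handful of numbers per window is held in memory at any time, so the amount retained does not grow with the number of events in the stream.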

Within data streams, however, data management functions such as normalization, cleansing and standardization do occur for individual events that are processed in-memory. So data can be corrected and assessed, even for complex pattern analysis, before the data is stored. ESP also makes it possible to make decisions from data while it’s still in motion, filtering down the volumes to just the data that should be stored.
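
As a rough illustration of cleansing, standardizing and filtering events while they are still in motion, the sketch below chains two Python generators. The event fields, the Fahrenheit-to-Celsius rule and the 0 to 80 degree "normal" range are all assumptions made for this example.

```python
def cleanse(events):
    """Standardize each event in memory; skip ones that cannot be repaired."""
    for event in events:
        unit = str(event.get("unit", "")).strip().upper()
        try:
            value = float(event["value"])
        except (KeyError, TypeError, ValueError):
            # Unusable reading. A real system might route it to a
            # data quality monitor instead of silently skipping it.
            continue
        if unit == "F":
            # Standardize on Celsius (assumed house convention).
            value = (value - 32.0) * 5.0 / 9.0
            unit = "C"
        if unit != "C":
            continue  # unknown unit: cannot be standardized
        yield {"device": str(event.get("device", "")).strip(),
               "unit": unit,
               "value": round(value, 2)}


def worth_storing(events, low=0.0, high=80.0):
    """Pass through only readings outside the expected range."""
    for event in events:
        if not (low <= event["value"] <= high):
            yield event
```

Calling worth_storing(cleanse(stream)) processes one event at a time, so readings that match the norm are dropped before anything is written to storage.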

One way to effectively use data management technology in ESP is to monitor equipment for signs of failure based on its data output. Generally, failing equipment produces data gaps and incorrectly formatted data before it fails completely. Consequently, data quality checks performed on an incoming data stream can help detect these anomalies before the equipment totally fails. Continuous pattern detection that predicts both which equipment is likely to fail and when helps organizations plan ahead for the necessary parts and personnel.
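
A data quality check of this kind might look like the following sketch. It assumes a device that is expected to report once per second, and it flags both gaps between readings and values that cannot be parsed as numbers. The function names, the interval and the threshold of three issues are hypothetical choices, and a simple count is far cruder than the predictive pattern detection described above.

```python
def quality_flags(records, expected_interval=1.0, tolerance=0.5):
    """Yield (timestamp, issue) pairs for gaps and malformed values.

    records is an iterable of (timestamp, raw_value) in time order.
    """
    previous_ts = None
    for timestamp, raw_value in records:
        # A gap: the device stayed silent longer than expected.
        if (previous_ts is not None
                and timestamp - previous_ts > expected_interval + tolerance):
            yield (timestamp, "gap")
        previous_ts = timestamp
        # A malformed value: the payload is not a usable number.
        try:
            float(raw_value)
        except (TypeError, ValueError):
            yield (timestamp, "malformed")


def needs_attention(records, max_issues=3, **options):
    """Return True when the issue count reaches the (assumed) threshold."""
    issue_count = sum(1 for _ in quality_flags(records, **options))
    return issue_count >= max_issues
```

In practice the threshold would be tuned per device type, and the flags would more likely feed a predictive model than a fixed count.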

Successfully handling increased data volumes means you’ll need to have efficient, effective policies to support your business goals. To do this, you’ll want to be able to properly define business rules and apply them to the advanced analytical routines that ESP runs on data streams. Used in conjunction with appropriate rules, streaming data can deliver immediate insights to help you do things like boost efficiencies, reduce costs and identify new opportunities.

Will my business be affected by streaming data?

If you are a technology manager, it will pay to stay abreast of developments with streaming data, ESP and IoT. This segment of the technology industry has all the earmarks of being a hotbed for innovation as individuals and businesses rapidly devise new uses for streaming data – even in industries that have not traditionally used streaming data or ESP.

Managing event streams and gathering useful business intelligence from streaming data requires industry-leading technology. Make sure you can do more than just answer the $800 technology buzzwords question. Learn how SAS’ data management software expertise can help businesses anticipate and take charge of the new opportunities that hinge on the explosion of streaming data.

Steve Putman has more than 25 years of experience supporting client/server and Internet-based operations from small offices to major corporations in both functional and technical roles. He is a thought leader in the fields of data governance, master data management, big data and semantic technologies.