Splunk Simplifies Big Data Analysis

Splunk introduces drag-and-drop data analysis and a new cloud option, but can its proprietary platform keep growing?

Would you rather choose a proprietary system that solves most of your challenges today, or would you place your bets on an open system like Hadoop, even if tools and techniques for that platform need to evolve?

Splunk is covering both bets with plans for a separate product designed to run on top of Hadoop. But on Tuesday, all attention was on the sixth major release of Splunk Enterprise, which has been deployed by more than 6,000 companies.

Splunk started out as a tool for IT, but it has steadily spread its wings to bigger and broader analyses. The upgrades in Splunk Enterprise 6, highlighted by a drag-and-drop data-exploration interface, are aimed at reaching an even wider audience of users.

"Splunk is trying to break out of the operational IT mold, where users are looking at right-here, right-now analyses of how servers and systems are operating," said Eric Hanselman, chief analyst at 451 Research, in an interview with InformationWeek. "Customers are ready to take a longer-term perspective, looking at how end users and customers are interacting with the company so they can be much more proactive with IT investments."

Splunk's sweet spot is machine data, a broad description that covers everything from clickstreams and server log files to point-of-sale and marketing automation systems. Most of these touch points have their own monitoring and charting capabilities, but Splunk's calling is applying analytics across machine data from multiple systems.

"You may be able to do charts and graphs in, say, a wireless management system, but you probably can't slice and dice that data and you likely won't have the depth of analyses available from Splunk," Hanselman said.

Splunk's proprietary SPL (Search Processing Language) is relatively simple for IT types, and it supports data searching, filtering, modification, manipulation, insertion and deletion. But query languages aren't right for business users. A Pivot interface introduced in Splunk Enterprise 6 is geared to non-technical users, providing click-and-drag data visualizations and filters. With the Enterprise 6 upgrade, Splunk's proprietary High Performance Analytics Store is said to deliver up to 1,000 times faster query performance than the previous version of the platform.
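To make the pivot idea concrete, here is a minimal Python sketch of the kind of cross-tabulation a drag-and-drop pivot produces over machine data. It is not Splunk's implementation and does not use SPL; the event records, the field names (`host`, `status`) and the `pivot_count` helper are all hypothetical, chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical machine-data events; the field names are illustrative only.
EVENTS = [
    {"host": "web01", "status": 200},
    {"host": "web01", "status": 500},
    {"host": "web02", "status": 200},
    {"host": "web02", "status": 200},
    {"host": "web01", "status": 200},
]


def pivot_count(events, row_field, col_field):
    """Count events for each (row value, column value) pair."""
    table = defaultdict(lambda: defaultdict(int))
    for event in events:
        table[event[row_field]][event[col_field]] += 1
    # Convert nested defaultdicts to plain dicts for easy printing.
    return {row: dict(cols) for row, cols in table.items()}


if __name__ == "__main__":
    # Rows are hosts, columns are HTTP status codes, cells are event counts.
    print(pivot_count(EVENTS, "host", "status"))
```

Choosing a row field and a column field in a visual interface amounts to picking the two arguments here; the business user never has to write the loop or a query.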

As for those companies counting on Hadoop as their big data backend, Splunk is also working on Hunk, the code name for Splunk Analytics for Hadoop. Announced in June and still in beta, Hunk promises to bring the company's SPL-based analytics capabilities to data residing in Hadoop clusters.

Hadoop users generally code MapReduce analyses from scratch or use the Hive SQL-like query interface, but Splunk touts Hunk as a shortcut to analysis. Splunk veterans will be able to use SPL, but the new Pivot drag-and-drop interface also will be added to Hunk, which is set for release by the end of the year.
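For readers unfamiliar with what coding a MapReduce analysis "from scratch" involves, the following is a rough sketch of the pattern in plain Python. It does not use Hadoop or any Hadoop API; the log lines and the `map_phase` and `reduce_phase` function names are assumptions made up for this example.

```python
from collections import defaultdict

# Hypothetical web-server log lines: method, path, HTTP status.
LOG_LINES = [
    "GET /index.html 200",
    "GET /missing 404",
    "POST /login 200",
]


def map_phase(line):
    """Emit a (status, 1) pair for a single log line."""
    status = line.rsplit(" ", 1)[-1]
    return (status, 1)


def reduce_phase(pairs):
    """Sum the emitted values for each key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


if __name__ == "__main__":
    # Count requests per status code across all lines.
    print(reduce_phase(map_phase(line) for line in LOG_LINES))
```

Even a simple count requires writing both phases by hand, which is the overhead a higher-level search language or pivot interface is meant to remove.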

In another move aimed at spreading access to Splunk, the company on Tuesday announced the general availability of Splunk Cloud, a new service that puts Splunk Enterprise software in the cloud. Splunk said the service will appeal particularly to companies that generate machine data in cloud-based environments, but it's also possible to correlate that data with information from on-premises systems.

Splunk Cloud offers a large-scale option that complements Splunk Storm, an entry-level cloud-based service introduced last year. Storm is being recast as a free service for testing and development, capped at 20 gigabytes of data per month. Splunk Cloud and Splunk Storm both run on Amazon Web Services.
