IBM today rolled out a tool it says can sift through terabytes of data, including email, to help customers detect external attacks aimed at stealing sensitive information or insider threats that might reveal corporate secrets.

The tool, called IBM Security Intelligence with Big Data, is built on two core IBM products: InfoSphere BigInsights, IBM's enterprise version of open-source Hadoop with analytics tools, and QRadar, the security information and event management (SIEM) product that IBM obtained when it acquired Q1 Labs in 2011.

At its heart, IBM Security Intelligence with Big Data would collect and analyze data at high speed; IBM thinks a cluster size of 500 terabytes would be a likely starting point. That data would include packet-capture data and security-event information from firewalls and other gear, along with a stream of content that might include anything from raw email to scraped SharePoint content, among other business information. The idea is to pull from this voluminous stream the clues that indicate a company is under attack or has been compromised, and how.

IBM's CTO Sandy Bird said the technology is most likely to first be adopted by large companies with data scientists on staff. He acknowledged there's still a lot to be learned about which analytical models and patterns will be the most successful in threat detection. IBM Security Intelligence with Big Data can in theory be applied to cloud-based services, but its starting point is likely to be deployment near the enterprise data center, where the massive amounts of data it needs are most easily accessed.

The tool is already being deployed in some large corporations and governments. Mark Clancy, chief information security officer at financial firm Depository Trust & Clearing Corporation, said the firm is using IBM's technology to get real-time security awareness. "We need to move from a world where we 'farm' security data and alerts with various prevention and detection tools to a situation where we actively 'hunt' for cyber-attackers in our networks."

IBM is not alone in talking up big data as a critical tool for security threat detection in the coming years. RSA, the security division of EMC, recently disclosed it's getting into it, too, even betting the company's future on it, with a product announcement anticipated soon.

Gartner analyst Neil MacDonald said players to watch include IBM, HP and RSA, which all have traditional SIEM technologies and are developing analytics to take on the big data challenges around advanced threat detection.

"Gartner believes the information-security problem can really only be solved with big data services," said MacDonald, noting that the term "big data" applies here to situations where combining large volumes or velocity of data, often contextual, requires a new approach for the purposes of advanced threat detection.

MacDonald said this data might be a combination of reputational analysis, firewall logs, network packet data and more contextual information to determine if an attack or compromise has occurred. Today, larger organizations such as big banks and the Defense Department are seeking to do this mainly by building their own big-data-for-security tools, he said. But buying rather than building complex tools like this is likely to prove attractive in the future, if not more cost effective.

It's all still considered emerging technology, but big data put into service for the purposes of security should evolve to be useful for small to midsize companies as well as the large ones, MacDonald said. It's possible big data for security could also one day be offered as a service, he suggested. IBM's Bird said that may be possible eventually, but for now big data for security purposes is seeing its initial deployment in large organizations with mountains of sensitive information at stake.

For a deployment of IBM Security Intelligence with Big Data, the pricing would look like this: QRadar is priced per appliance and by the quantity of data collected (events and network flows per second). BigInsights is priced by total storage capacity of the cluster. QRadar pricing starts below $50,000. BigInsights pricing starts below $50,000 for a 5TB storage system.