At the recent Smarter Analytics Live 2013 forum in Melbourne, IBM senior consultant for enterprise content management solutions Adrian Barfield noted that fraud investigators often spend only 20 percent of their time actually doing the analysis work that uncovers data wrongdoing. A full 80 percent of the effort goes toward figuring out what information to use and how to use it, since today's data stream includes a diverse range of information sources and types. Barfield says this flips the conventional model of security information processing upside down: you create the context for your investigation by sifting through large volumes of information.

Barfield cautions that "things are becoming more and more complicated" because analyzing structured data is a different task from analyzing unstructured data, and the two types often need to be correlated: for example, reconciling structured activity logs with less structured help-desk logs or security incident reports. Security officers need a way to quickly identify patterns and to build and deploy new security models.
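To make that correlation step concrete, here is a minimal Java sketch of the idea. Every class, field, and threshold below is a hypothetical illustration, not part of any IBM product: it flags structured activity-log events whose user ID also appears in a help-desk ticket opened around the same time.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;

// Hypothetical illustration: correlating structured activity-log entries
// with unstructured help-desk tickets by shared user ID and time proximity.
public class LogCorrelator {

    // Assumed record shapes; a real system would map these from a SIEM
    // export and a ticketing database.
    record ActivityEvent(String userId, Instant timestamp, String action) {}
    record HelpDeskTicket(Instant opened, String body) {}

    /** Prints events whose user ID appears in a ticket opened within 24 hours. */
    static void correlate(List<ActivityEvent> events, List<HelpDeskTicket> tickets) {
        Duration window = Duration.ofHours(24); // assumed correlation window
        for (ActivityEvent e : events) {
            for (HelpDeskTicket t : tickets) {
                boolean mentionsUser = t.body().contains(e.userId()); // naive keyword match
                boolean nearInTime =
                        Duration.between(e.timestamp(), t.opened()).abs().compareTo(window) <= 0;
                if (mentionsUser && nearInTime) {
                    System.out.printf("Possible link: %s (%s) <-> ticket opened %s%n",
                            e.userId(), e.action(), t.opened());
                }
            }
        }
    }
}
```

A production system would replace the naive substring match with proper entity extraction over the ticket text, which is exactly the kind of task the unstructured-analytics tooling described next is designed for.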

One tool that helps organize this new security paradigm is the open source Apache UIMA project. Unstructured Information Management applications are software systems that analyze large volumes of unstructured information in order to discover knowledge that is relevant to the user. IBM developed UIMA, and the framework helped make its Watson artificial intelligence platform a reality.
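To give a flavor of how a UIMA component looks in practice, here is a minimal annotator sketch against the core Apache UIMA Java API. The account-number regex and class name are assumptions for illustration; real pipelines define their own annotation types in a type system descriptor.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import org.apache.uima.analysis_component.JCasAnnotator_ImplBase;
import org.apache.uima.analysis_engine.AnalysisEngineProcessException;
import org.apache.uima.jcas.JCas;
import org.apache.uima.jcas.tcas.Annotation;

// Minimal UIMA analysis engine: marks spans that look like account numbers
// in unstructured text, such as the body of a security incident report.
public class AccountNumberAnnotator extends JCasAnnotator_ImplBase {

    // Illustrative pattern: an 8- to 12-digit run standing in for an account number.
    private static final Pattern ACCOUNT = Pattern.compile("\\b\\d{8,12}\\b");

    @Override
    public void process(JCas jcas) throws AnalysisEngineProcessException {
        Matcher m = ACCOUNT.matcher(jcas.getDocumentText());
        while (m.find()) {
            // Uses the base Annotation type from uimaj-core; a real pipeline
            // would define a richer custom type for downstream consumers.
            Annotation span = new Annotation(jcas, m.start(), m.end());
            span.addToIndexes();
        }
    }
}
```

An analysis engine like this is typically declared in an XML descriptor and chained with other annotators, so that downstream components such as a search index or classifier can pull the marked spans out of the CAS.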

Another IBM tool that helps determine which data is relevant to a fraud investigation is the Intelligent Investigation Manager, a bundle of capabilities that optimizes fraud investigation and analysis by dynamically coordinating and reporting on cases, and by analyzing and visualizing fraud within structured and unstructured data across silos. The component that bridges the gap between structured and unstructured data is Content Analytics with Enterprise Search:

It allows an investigator to cast a wide search net over a range of data types.

It provides the ability to make a fast, comprehensive analysis of disparate data types, enabling the user to sort results into usefulness categories. It also helps surface emerging trends from mountains of data so the investigator can start formulating a modus operandi for the case; a generic sketch of the categorization idea follows.
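Content Analytics with Enterprise Search is a packaged product, so the following Java sketch is only a generic illustration of sorting search hits into usefulness categories, not the product's API; the keyword list and thresholds are assumptions.

```java
import java.util.List;

// Generic illustration only; not the IBM product's API. Sorts search hits
// into coarse usefulness categories by counting investigation keywords.
public class UsefulnessClassifier {

    record SearchHit(String source, String text) {}

    static String categorize(SearchHit hit, List<String> keywords) {
        long matches = keywords.stream()
                .filter(k -> hit.text().toLowerCase().contains(k))
                .count();
        // Assumed thresholds chosen purely for illustration.
        if (matches >= 3) return "HIGH";
        if (matches >= 1) return "REVIEW";
        return "LOW";
    }

    public static void main(String[] args) {
        List<String> keywords = List.of("refund", "override", "wire transfer");
        SearchHit hit = new SearchHit("helpdesk",
                "Customer requested manual refund override");
        // Two keywords match, so this hit lands in the REVIEW category.
        System.out.println(hit.source() + " -> " + categorize(hit, keywords));
    }
}
```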