Mladen A. Vouk Collection

The ability to leverage diverse data types requires a robust and dynamic approach to systems design. The needs of a data scientist are as varied as the questions being explored. Compute systems have focused on the management and analysis of structured data as the driving force of analytics in business. As open source platforms have evolved, the ability to apply compute to unstructured information has exposed an array of platforms and tools available to the business and technical community. We have developed a platform that meets the needs of the analytics user requirements of both structured and unstructured data. This analytics workbench is based on acquisition, transformation, and analysis using open source tools such as Nutch, Tika, Elastic, Python, PostgreSQL, and Django to implement a cognitive compute environment that can handle widely diverse data, and can leverage the ever-expanding capabilities of infrastructure in order to provide intelligence augmentation.

Cyber-attacks and breaches are often detected too late to avoid damage. While "classical" reactive cyber defenses usually work only if we have some prior knowledge about the attack methods and "allowable" patterns, properly constructed redundancy-based anomaly detectors can be more robust and often able to detect even zero day attacks. They are a step toward an oracle that uses knowable behavior of a healthy system to identify abnormalities. In the world of Internet of Things (IoT), security, and anomalous behavior of sensors and other IoT components, will be orders of magnitude more difficult unless we make those elements security aware from the start. In this article we examine the ability of redundancy-based anomaly detectors to recognize some high-risk and difficult to detect attacks on web servers---a likely management interface for many IoT stand-alone elements. In real life, it has taken long, a number of years in some cases, to identify some of the vulnerabilities and related attacks. We discuss practical relevance of the approach in the context of providing high-assurance Web-services that may belong to autonomous IoT applications and devices.