Machine Learning-based anomaly detection in Azure Stream Analytics

Customers who monitor real-time data can now easily detect events or observations that do not conform to an expected pattern thanks to machine learning-based anomaly detection in Azure Stream Analytics, announced for private preview today.

Up to now, Industrial IoT customers, and others, who monitor streaming data relied on expensive custom machine learning models. Implementers needed to have intimate familiarity with the use case and the problem domain, and integrating these models with the stream processing mechanisms required complex data pipeline engineering. The high barrier to entry precluded adoption of anomaly detection in streaming pipelines despite the associated value for many Industrial IoT sites.

Monitoring made easy

The new capability makes it quick and easy to do service monitoring by tracking KPIs over time, usage monitoring through metrics such as number of searches and number of clicks, and performance monitoring through counters such as memory, CPU, and file reads. Customers no longer need to build complex and expensive anomaly detection models and integrate them with streaming pipelines.

The new functionality is targeted towards numerical time series data. Azure Stream Analytics can detect positive and negative trends, and changes in a dynamic range of values. For example, in IT monitoring scenarios where event data is streamed to Azure Stream Analytics, trend detection can be used to generate alerts for upward trends in memory usage, since such a trend may be indicative of a memory leak. Similarly, alerts on exceptions indicative of service health instability can be obtained by detecting changes in the dynamic range of values. Spikes in the number of login failures can be used to raise security alerts.

The power of Machine Learning

A simple function call in a declarative Azure Stream Analytics query can detect anomalies in the input data. The underlying general-purpose machine learning model is abstracted out and powers the function calls. The underlying machine learning detectors track changes in values and report ongoing changes in their values as anomaly scores. The general-purpose model does not require ad-hoc threshold tuning and uses continuous learning to learn over time. The function calls return anomaly scores and binary spike indicators for each point in time.

How to enable anomaly detection with declarative SQL

The following examples highlight the productivity wins of enabling anomaly detection in a declarative SQL-like query language to reason about data in motion.