What is Stream Analytics?

Azure Stream Analytics is a fully managed event-processing engine that lets you set up real-time analytic computations on streaming data. The data can come from devices, sensors, websites, social media feeds, applications, infrastructure systems, and more.

What can I do with Stream Analytics?

Use Stream Analytics to examine high volumes of data flowing from devices or processes, extract information from the data stream, and look for patterns, trends, and relationships. Based on what's in the data, you can then perform application tasks. For example, you might raise alerts, kick off automation workflows, feed information to a reporting tool such as Power BI, or store data for later investigation.

Typical scenarios include:

Analysis of data generated by sensors and actuators embedded in physical objects (Internet of Things, or IoT).

Web clickstream analytics.

Customer relationship management (CRM) applications, such as issuing alerts when customer experience within a time frame is degraded.

How does Stream Analytics work?

This diagram illustrates the Stream Analytics pipeline, showing how data is ingested, analyzed, and then sent for presentation or action.

Stream Analytics starts with a source of streaming data. The data can be ingested into Azure from a device using an Azure event hub or IoT hub. The data can also be pulled from a data store like Azure Blob Storage.

To examine the stream, you create a Stream Analytics job that specifies where the data is coming from. The job also specifies a transformation, which defines how to look for data, patterns, or relationships. For this task, Stream Analytics supports a SQL-like query language that lets you filter, sort, aggregate, and join streaming data over a time period.
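To make the idea of aggregating over a time period concrete, here is a minimal Python sketch of a fixed, non-overlapping (tumbling) time window. It is not Stream Analytics query syntax, and it does not describe how the service is implemented. The event fields (`ts`, `device`, `value`) and the function name `tumbling_window_avg` are assumptions made up for this illustration.

```python
from collections import defaultdict


def tumbling_window_avg(events, window_seconds):
    """Average 'value' per device within fixed, non-overlapping time windows.

    Each event is a dict with hypothetical keys:
    'ts' (event time in seconds), 'device', and 'value'.
    """
    # (window_start, device) -> [running total, event count]
    sums = defaultdict(lambda: [0.0, 0])
    for event in events:
        # Every timestamp maps to exactly one window.
        window_start = (event["ts"] // window_seconds) * window_seconds
        bucket = sums[(window_start, event["device"])]
        bucket[0] += event["value"]
        bucket[1] += 1
    return {key: total / count for key, (total, count) in sorted(sums.items())}


events = [
    {"ts": 1, "device": "a", "value": 20.0},
    {"ts": 4, "device": "a", "value": 30.0},
    {"ts": 7, "device": "b", "value": 10.0},
    {"ts": 12, "device": "a", "value": 50.0},
]
averages = tumbling_window_avg(events, window_seconds=10)
```

Here the two readings from device `a` in the first 10 seconds are averaged together, while the reading at second 12 falls into the next window. In a real job you would express this kind of grouping declaratively in the query language rather than writing loops.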

Finally, the job specifies an output to send the transformed data to. This lets you control what to do in response to the information you've analyzed. For example, in response to analysis, you might:

Send a command to change a device's settings.

Send data to a queue that's monitored by a process that takes action based on what it finds.

Job input can also include reference data (static or slow-changing data). You can join streaming data to this reference data to perform lookup operations the same way you would with database queries.
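The lookup pattern can be sketched in a few lines of Python. This is only an illustration of joining a stream to a static table, not the service's API. The names `join_with_reference`, `device_info`, and the event fields are hypothetical.

```python
def join_with_reference(events, reference, key):
    """Enrich each streaming event with fields from a static lookup table.

    'reference' maps a key value to a dict of extra fields. Events whose
    key has no entry are dropped, which mirrors an inner join.
    """
    for event in events:
        match = reference.get(event[key])
        if match is not None:
            yield {**event, **match}


# Hypothetical reference data: slow-changing metadata about each device.
device_info = {
    "a": {"location": "plant-1"},
    "b": {"location": "plant-2"},
}

readings = [
    {"device": "a", "value": 20.0},
    {"device": "c", "value": 5.0},   # no reference entry, so it is dropped
    {"device": "b", "value": 10.0},
]
enriched = list(join_with_reference(readings, device_info, key="device"))
```

Because the reference table is small and changes rarely, it can be held in memory and consulted for every incoming event, which is what makes the lookup cheap compared with joining two fast-moving streams.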

You can route Stream Analytics job output in many directions. You can write to storage, such as Azure Storage blobs or tables, Azure SQL Database, Azure Data Lake Store, or Azure Cosmos DB. From there, the data might go for batch analytics via Azure HDInsight. You might send the output to another service for consumption by another process, such as Azure Event Hubs or Azure Service Bus topics or queues. You might also send the output to Power BI for visualization.

Ease of use

To define transformations, you use a simple, declarative Stream Analytics query language that lets you create sophisticated analyses with no programming. The query language takes streaming data as its input. You can then filter and sort the data, aggregate values, perform calculations, join data (within a stream or to reference data), and use geospatial functions. You can edit queries in the portal, using IntelliSense and syntax checking, and you can test queries using sample data that you can extract from the live stream.

Extensible query language

You can extend the capabilities of the query language by defining and invoking additional functions. You can define function calls in the Azure Machine Learning service to take advantage of Azure Machine Learning solutions. You can also integrate JavaScript user-defined functions (UDFs) to perform complex calculations as part of a Stream Analytics query.

Scalability

Stream Analytics can handle up to 1 GB of incoming data per second. Integration with Azure Event Hubs and Azure IoT Hub allows jobs to ingest millions of events per second coming from connected devices, clickstreams, and log files, to name a few. Using the partition feature of event hubs, you can partition computations into logical steps, each with the ability to be further partitioned to increase scalability.
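The following Python sketch illustrates why partitioning helps a computation scale: if every event with the same key lands in the same partition, each partition can be processed independently and the partial results combined afterward. It is a conceptual illustration only and does not reflect how Event Hubs or Stream Analytics assign partitions. The function names, the `device` field, and the partition count are assumptions.

```python
import zlib
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def partition_events(events, key, partition_count):
    """Assign each event to a partition using a stable hash of its key."""
    partitions = defaultdict(list)
    for event in events:
        index = zlib.crc32(str(event[key]).encode("utf-8")) % partition_count
        partitions[index].append(event)
    return dict(partitions)


def count_per_key(partition, key):
    """Per-partition step: count events for each key value."""
    counts = defaultdict(int)
    for event in partition:
        counts[event[key]] += 1
    return dict(counts)


events = [{"device": d} for d in ["a", "b", "a", "c", "b", "a"]]
partitions = partition_events(events, key="device", partition_count=4)

# Each partition is processed independently, so the work can run in parallel.
with ThreadPoolExecutor() as pool:
    partial_results = list(
        pool.map(lambda p: count_per_key(p, "device"), partitions.values())
    )

totals = {}
for result in partial_results:
    # Safe to merge directly: a given key lives in exactly one partition.
    totals.update(result)
```

The merge step is trivial here because the grouping key and the partition key are the same. When they differ, partial results for one key can appear in several partitions and need an extra combining step.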

Low cost

As a cloud service, Stream Analytics is optimized to let you get going at low cost. You pay as you go based on streaming-unit usage and the amount of data processed by the system. Usage is calculated from the volume of events processed and the amount of compute power provisioned within the cluster to handle Stream Analytics jobs.

Reliability, quick recovery, and repeatability

As a managed service in the cloud, Stream Analytics helps prevent data loss and provides business continuity. If failures occur, the service provides built-in recovery capabilities. Because the service maintains state internally, it produces repeatable results: you can archive events, reapply processing in the future, and always get the same results. This enables you to go back in time and investigate computations when doing root-cause analysis, what-if analysis, and so on.
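One way to think about repeatability is that processing depends only on the events themselves, including their timestamps, and never on wall-clock time or arrival order. The Python sketch below illustrates that idea with a hypothetical `process` function and made-up event fields. It says nothing about how the service achieves this internally.

```python
def process(archived_events):
    """Compute a running total per device, driven only by event timestamps."""
    totals = {}
    output = []
    # Sorting by event time (then device) removes any dependence on the
    # order in which events happened to arrive or were read from the archive.
    for event in sorted(archived_events, key=lambda e: (e["ts"], e["device"])):
        device = event["device"]
        totals[device] = totals.get(device, 0) + event["value"]
        output.append((event["ts"], device, totals[device]))
    return output


archive = [
    {"ts": 3, "device": "a", "value": 5},
    {"ts": 1, "device": "a", "value": 2},
    {"ts": 2, "device": "b", "value": 7},
]

first_run = process(archive)
# Replaying the same archive later, even in a different read order,
# yields identical output.
replay = process(list(reversed(archive)))
```

Because the function is deterministic over the archived input, a later replay can be compared line by line with the original output, which is what makes root-cause and what-if analysis practical.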