Integrating AMQP with Apache Spark

Introduction

This application shows how to integrate AMQP-based products with Apache Spark Streaming.
It uses the AMQP Spark Streaming connector, which gets messages from an AMQP source
and pushes them to the Spark engine as micro-batches for real-time analytics.
The other main point is that the Apache Spark deployment used here isn’t in standalone mode
but runs on OpenShift. Finally, an Apache Artemis instance is used as the
AMQP source/destination for the exchanged messages.

The application consists of a publisher that simulates temperature values
from a sensor and sends them to the temperature queue available on the broker via AMQP.
At the same time, a Spark driver application uses the AMQP connector to read the messages
from that queue and push them to the Spark engine.
The driver application shows the maximum temperature value over the last 5 seconds.
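To make the "max temperature in the last 5 seconds" computation concrete, here is a minimal sketch in plain Python. It does not use Spark or the AMQP connector; the class name `WindowedMax` and its methods are hypothetical and exist only for illustration. It approximates the idea with a time-based sliding window, whereas Spark Streaming evaluates windows over micro-batches.

```python
from collections import deque


class WindowedMax:
    """Tracks the maximum reading seen within the last `window_seconds`.

    Hypothetical helper for illustration; not part of Spark or the connector.
    """

    def __init__(self, window_seconds=5.0):
        self.window_seconds = window_seconds
        self._readings = deque()  # (timestamp, temperature) pairs, oldest first

    def add(self, timestamp, temperature):
        """Record a reading and drop readings that fell out of the window."""
        self._readings.append((timestamp, temperature))
        cutoff = timestamp - self.window_seconds
        while self._readings and self._readings[0][0] <= cutoff:
            self._readings.popleft()

    def current_max(self):
        """Return the max temperature in the window, or None if it is empty."""
        if not self._readings:
            return None
        return max(temp for _, temp in self._readings)
```

Readings are assumed to arrive in timestamp order. A reading older than the window relative to the newest one no longer affects the result, which mirrors how the driver reports only recent values.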

Architecture

The scenario for using the AMQP - Spark Streaming connector is shown in the following picture.

The main components that define the architecture of the application are:

An Apache Spark cluster running on OpenShift

An Apache Artemis broker instance for the queue used for exchanging messages

The AMQP - Spark Streaming connector for reading messages from the above queue
and providing them to the Spark engine for stream processing

A simple client application which simulates a sensor and sends temperature values
to the queue using the AMQP protocol
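The sensor-simulating client in the list above can be sketched roughly as follows. This is an illustrative assumption, not the project's actual publisher: the function names, the 20 to 30 degree range, and the `send` callable (a stand-in for whatever AMQP client call delivers a message body to the temperature queue) are all hypothetical.

```python
import random


def simulate_readings(count, low=20, high=30, seed=None):
    """Yield `count` simulated integer temperature readings in [low, high].

    The default range is an arbitrary assumption for illustration.
    """
    rng = random.Random(seed)
    for _ in range(count):
        yield rng.randint(low, high)


def publish(readings, send):
    """Pass each reading, encoded as text, to the `send` callable.

    `send` is a hypothetical stand-in for an AMQP client call that would
    deliver the message body to the temperature queue.
    Returns the number of messages handed to `send`.
    """
    sent = 0
    for value in readings:
        send(str(value))
        sent += 1
    return sent
```

For a quick dry run without a broker, `publish(simulate_readings(5), print)` writes five simulated readings to standard output. Keeping the transport behind a callable lets the simulation be exercised separately from the AMQP connection.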

Installation

The first step is building the source code from the repository as described at
the following link.

After that, OpenShift is needed in order to run an Apache Spark cluster as described
here.

Finally, the cluster deployment is made possible using the Oshinko web application as described
at this link.

Usage

Running the application is simple; the steps are described in the running section
of the upstream project here.

Expansion

The user is free to change the application driver in order to apply different
Spark operations for gathering different insights (not only the max value) from
the stream of temperature values.
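As one possible direction, the sketch below computes a few additional statistics over a batch of temperature values. It is plain Python rather than Spark code, and the function name `summarize` is hypothetical; in the real driver the equivalent logic would be expressed with Spark operations on the stream.

```python
from statistics import mean


def summarize(temperatures):
    """Return min, max, and mean of a batch of readings, or None if empty."""
    values = list(temperatures)
    if not values:
        return None
    return {"min": min(values), "max": max(values), "mean": mean(values)}
```

Applied to each window of readings, a summary like this gives a fuller picture of the sensor's behavior than the maximum alone.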