Introduction

This week I will continue my series of posts looking at data processing patterns used to build event-triggered streaming applications, focusing on joining event streams. I'll cover some related use cases and how you would go about implementing them within Wallaroo.

The purpose of these posts is to help you understand the data processing use cases that Wallaroo is best designed to handle and how you can go about building them.

I will be looking at the Wallaroo application builder, the part of your application that hooks into the Wallaroo framework.

Use Case

For the purpose of this post, I've simplified the use case and adapted the application builder code.

The simplified use case is as follows: an email promotion is sent to the individual who clicks on an ad if they have been identified as a loyal customer.

This use case requires two event streams. The first ingests records for identified loyal customers and saves them to a state object. The second ingests a stream of click data. When an incoming click comes from an identified loyal customer, that ad click triggers an email with the promotion.

The step in the loyal customer pipeline is a stateful partition that calls the function save_loyal_customer. Because this is a partitioning step, Wallaroo automatically routes the data for a specific customer to wherever the state object for that customer lives. The routing key is determined by extract_customer_key.
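To make the routing idea concrete, here is a minimal plain-Python sketch. It does not use the Wallaroo API, and Wallaroo handles partition placement for you. Only the names save_loyal_customer and extract_customer_key come from the application; the record shape (a dict with a customer_id field), the LoyaltyState class, the partition count, the hashing scheme, and the function bodies are all assumptions made for illustration.

```python
import zlib

NUM_PARTITIONS = 4  # hypothetical; Wallaroo manages partitions itself


class LoyaltyState:
    """Hypothetical per-customer state object."""

    def __init__(self):
        self.is_loyal = False


def extract_customer_key(record):
    # Assumed record shape: {"customer_id": "..."}.
    return record["customer_id"]


def partition_for(key):
    # A stable hash, so the same customer always maps to the same partition.
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS


def save_loyal_customer(record, state):
    # Mark this customer's state as loyal; nothing is emitted downstream.
    state.is_loyal = True
    return None


# One dict of state objects per partition, keyed by customer.
partitions = [dict() for _ in range(NUM_PARTITIONS)]


def route_and_save(record):
    # Find the partition that owns this customer, then update the state there.
    key = extract_customer_key(record)
    index = partition_for(key)
    state = partitions[index].setdefault(key, LoyaltyState())
    save_loyal_customer(record, state)
    return index
```

Because the partition is derived only from the key, every record for a given customer lands on the same state object, no matter which stream it arrived on.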

The step in the click pipeline uses the same stateful partition that was defined in the previous step, but calls the function check_loyal_click, which checks whether the customer who performed the click is a loyal customer.

This is how you implement a join in Wallaroo: each pipeline has a computation that makes use of a shared state object. Each of these computations interacts with the state object and performs its part of the join logic.
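The join pattern can be simulated in a few lines of plain Python. This is a sketch under stated assumptions, not Wallaroo code: the event dictionaries, the CustomerState class, the stream labels "loyalty" and "click", and the run function are hypothetical, and the bodies of save_loyal_customer and check_loyal_click are guesses at the behaviour described above.

```python
class CustomerState:
    """Hypothetical shared state object, one per customer key."""

    def __init__(self):
        self.is_loyal = False


def extract_customer_key(event):
    # Assumed event shape: a dict carrying a "customer_id" field.
    return event["customer_id"]


def save_loyal_customer(event, state):
    # Loyal customer pipeline: record the fact, emit nothing.
    state.is_loyal = True
    return None


def check_loyal_click(event, state):
    # Click pipeline: emit a message only if the customer is known to be loyal.
    if state.is_loyal:
        return {"customer_id": event["customer_id"], "ad_id": event["ad_id"]}
    return None


def run(events):
    # Both computations look up state by the same key, which is what joins them.
    states = {}
    computations = {"loyalty": save_loyal_customer, "click": check_loyal_click}
    outputs = []
    for stream, event in events:
        state = states.setdefault(extract_customer_key(event), CustomerState())
        result = computations[stream](event, state)
        if result is not None:
            outputs.append(result)
    return outputs


events = [
    ("click", {"customer_id": "alice", "ad_id": "ad-1"}),
    ("loyalty", {"customer_id": "alice"}),
    ("click", {"customer_id": "alice", "ad_id": "ad-2"}),
    ("click", {"customer_id": "bob", "ad_id": "ad-3"}),
]
emails = run(events)
```

In this run only Alice's second click produces a message. Her first click arrives before her loyalty record and is dropped, which shows that this style of join depends on arrival order; whether to buffer early clicks is a design decision for your own join logic.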

ab.to_sink(wallaroo.TCPSinkConfig(out_host, out_port, cc_encoder))

In the last step, we will pass data out of Wallaroo for further processing. In this case, we will only pass along messages for loyal customers to be processed by an email server external to Wallaroo.
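The body of cc_encoder is not shown here. As an illustration of what a sink encoder typically has to do, the sketch below turns a message into length-prefixed bytes that an external server could read off a TCP connection. The function names encode_promotion and decode_promotion, the JSON payload, and the 4-byte big-endian length header are all assumptions, not the format the actual application uses, and the sketch leaves out how an encoder is registered with Wallaroo.

```python
import json
import struct


def encode_promotion(message):
    # Hypothetical framing: 4-byte big-endian length, then a UTF-8 JSON payload.
    payload = json.dumps(message, sort_keys=True).encode("utf-8")
    return struct.pack(">I", len(payload)) + payload


def decode_promotion(frame):
    # What a receiving email server might do to recover the message.
    (length,) = struct.unpack(">I", frame[:4])
    return json.loads(frame[4:4 + length].decode("utf-8"))
```

A length prefix lets the receiver split the byte stream back into individual messages without relying on a delimiter character that might appear in the payload.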

Conclusion

The joining streams pattern is used frequently when building streaming data applications. Because Wallaroo lets you implement whatever join logic you require, it is a very powerful model.

Give Wallaroo a Try

We hope that this post has piqued your interest in Wallaroo!

If you are just getting started, we recommend you try our Docker image, which allows you to get Wallaroo up and running in only a few minutes.