AWS Releases CloudTrail Processing Library

Amazon Web Services (AWS) recently released the AWS CloudTrail Processing Library (CPL), a "Java client library that makes it easy to build an application that reads and processes CloudTrail log files in a fault tolerant and highly scalable manner".

AWS CloudTrail records all API calls made in an AWS account for logging and auditing use cases including security analysis, change tracking, compliance aid and operational troubleshooting, as explained in more detail in our previous coverage. It was introduced at re:Invent 2013 and expanded over the course of 2014 to support all public AWS regions and most services.

As usual, AWS provides an API for integrating CloudTrail with custom monitoring solutions. However, implementing the logic for processing CloudTrail events required interaction with at least three services (Amazon S3, Amazon SNS and CloudTrail itself) while also accounting for resiliency and fault tolerance, which made it a cumbersome task.

This has now been addressed by a "new extension to the AWS SDK for Java":

The AWS CloudTrail Processing Library, or CPL, eliminates the need to write code that polls Amazon SQS queues, reads and parses queue messages, downloads CloudTrail log files, and parses and serializes events […]. Developers can read and process CloudTrail log files in as few as 10 lines of code. CPL handles transient and enduring failures […] in a resilient and fault tolerant manner. CPL is built to scale easily and can process an unlimited number of log files in parallel.
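The kind of processing loop this leaves to the developer can be sketched roughly as follows. This is a minimal, self-contained illustration of the callback pattern only: `Event`, `EventsProcessor` and `Ec2CallCollector` are stand-in types invented for this sketch, not the CPL's actual classes, and the event source strings are merely example values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Stand-in types for illustration only; these are NOT the CPL's classes.
class CloudTrailLoopSketch {

    /** Hypothetical, simplified view of a single CloudTrail log record. */
    static final class Event {
        final String eventSource;
        final String eventName;

        Event(String eventSource, String eventName) {
            this.eventSource = eventSource;
            this.eventName = eventName;
        }
    }

    /** Hypothetical callback that receives batches of already parsed events. */
    interface EventsProcessor {
        void process(List<Event> events);
    }

    /** Collects the names of EC2 API calls and skips everything else. */
    static final class Ec2CallCollector implements EventsProcessor {
        final List<String> seen = new ArrayList<>();

        @Override
        public void process(List<Event> events) {
            for (Event event : events) {
                // Filtering happens directly inside the processing loop.
                if (!"ec2.amazonaws.com".equals(event.eventSource)) {
                    continue;
                }
                seen.add(event.eventName);
            }
        }
    }

    public static void main(String[] args) {
        Ec2CallCollector collector = new Ec2CallCollector();
        // In a real application the library would deliver these batches;
        // here a hand-written batch stands in for a downloaded log file.
        collector.process(Arrays.asList(
                new Event("ec2.amazonaws.com", "RunInstances"),
                new Event("s3.amazonaws.com", "PutObject"),
                new Event("ec2.amazonaws.com", "TerminateInstances")));
        System.out.println(collector.seen); // prints [RunInstances, TerminateInstances]
    }
}
```

The polling, downloading and parsing steps named in the quote are deliberately absent from the sketch, since those are the parts the library is said to take over.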

As illustrated by Jason Fulghum in his introductory post on the Java SDK blog, events can be filtered directly within the loop that processes each batch of events. More advanced use cases can be implemented by means of a few dedicated interfaces instead:

EventFilter – provides a callback to determine whether or not to process a log record