The InfoQ eMag: Streaming Architecture


With the rise of technologies like Apache Kafka, Apache Beam, and Spark Streaming, stream processing has become an increasingly popular topic. Commercial businesses are being formed around the associated open-source technology, conference talks are filled with stories of migrations from batch Extract-Transform-Load (ETL) to stream processing, and blog posts and online discussions debate important questions such as whether exactly-once processing is really achievable (as shown in the second article, the answer is yes, with caveats).

This InfoQ eMag aims to cut through some of the hype and introduce you to core stream-processing concepts such as the log, the Dataflow model, and building fault-tolerant streaming systems.


The Streaming Architecture eMag includes:

Exploring the Fundamentals of Stream Processing with the Dataflow Model and Apache Beam - Frances Perry and Tyler Akidau discuss the Google Dataflow model and its practical implementation within the Apache Beam stream processing platform.

Demystifying DynamoDB Streams: An Introduction to Ordering, Deduplication and Checkpointing - Akshat Vig and Khawaja Shams explore the implementation of Amazon DynamoDB Streams, and argue that understanding ordering and the effects of event duplication are vital for building distributed systems.

Is Batch ETL Dead, and is Apache Kafka the Future of Data Processing? - Neha Narkhede argues that traditional batch ETL cannot meet the requirements of modern data processing, and that Apache Kafka can instead be used to create a real-time stream processing platform.

Migrating Batch ETL to Stream Processing: A Netflix Case Study with Kafka and Flink - Shriya Arora presents a Netflix case study of a data processing migration to Apache Flink, and discusses the many decisions and tradeoffs that must be made when moving from batch ETL to stream processing.

When Streams Fail: Implementing a Resilient Apache Kafka Cluster at Goldman Sachs - Anton Gorshkov discusses how the Goldman Sachs platform team designed and operated a resilient on-premise Apache Kafka cluster, which is the foundation of their stream processing capabilities.