Kleppmann, author of the upcoming book Designing Data-Intensive Applications, claims that many of the ideas are quite simple and worth learning about: they can help us build applications that are more scalable, reliable and maintainable.

Kleppmann takes Google Analytics, a tool that tracks the pages viewed by visitors to a website, as an example of using events. With this tool, every page view results in an event containing e.g. the URL of the page, a timestamp and the client IP address, which may amount to a huge number of events for a popular website. To collate usage of the website from these events there are two options, both useful but in different scenarios:

Store all events in some type of data store and then use a query language to group by URL, time period and so on, essentially doing the aggregation when requested. An advantage of this technique is the ability to run new calculations on old data.

Aggregate on URL, time, etc. as the events arrive, without storing the events, e.g. using an OLAP cube. One advantage this gives is the ability to make decisions in real time, for instance limiting the number of requests from a specific client.
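The two options can be sketched in a few lines of Python. This is a minimal illustration, not from the talk itself; the event fields (url, timestamp, ip) follow the Google Analytics example above, and the rate-limit threshold is an arbitrary assumption:

```python
from collections import Counter, defaultdict

# Hypothetical page-view events; field names are illustrative.
events = [
    {"url": "/home", "timestamp": 1, "ip": "10.0.0.1"},
    {"url": "/home", "timestamp": 2, "ip": "10.0.0.2"},
    {"url": "/about", "timestamp": 3, "ip": "10.0.0.1"},
]

# Option 1: store every raw event, aggregate on demand at query time.
stored = list(events)                      # the "data store"
views_per_url = Counter(e["url"] for e in stored)

# Option 2: aggregate as events arrive, without keeping the raw events.
live_counts = defaultdict(int)
for e in events:                           # imagine this loop is the ingest path
    live_counts[e["url"]] += 1
    if live_counts[e["url"]] > 100:        # real-time decision, e.g. rate limiting
        pass                               # throttle requests for this URL
```

Both paths end up with the same counts here; the difference is that option 1 keeps the raw events and can answer new questions later, while option 2 only retains the running aggregate.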

Event sourcing is a similar idea coming from the Domain-Driven Design (DDD) community. A common example here is a shopping cart used on an e-commerce website. Instead of changing and storing the current state of the cart, every event that changes the state of the cart is stored. Examples of events are ItemAdded and ItemQuantityChanged. Replaying the events, or aggregating over them, recreates the current state of the cart, and Kleppmann notes that this is very similar to the Google Analytics example.
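The shopping-cart idea can be sketched as a fold over the event log. The event shapes below are illustrative, not taken from any specific framework; only the event names ItemAdded and ItemQuantityChanged come from the example above:

```python
# Event-sourced shopping cart: state is derived by replaying events,
# never by mutating and storing the current state directly.
events = [
    {"type": "ItemAdded", "item": "book", "quantity": 1},
    {"type": "ItemAdded", "item": "pen", "quantity": 2},
    {"type": "ItemQuantityChanged", "item": "pen", "quantity": 5},
]

def apply_event(cart, event):
    cart = dict(cart)                      # events never mutate prior state
    if event["type"] == "ItemAdded":
        cart[event["item"]] = cart.get(event["item"], 0) + event["quantity"]
    elif event["type"] == "ItemQuantityChanged":
        cart[event["item"]] = event["quantity"]
    return cart

def replay(events):
    cart = {}
    for e in events:
        cart = apply_event(cart, e)
    return cart

cart = replay(events)   # {"book": 1, "pen": 5}
```

Replaying the full list always yields the same current state, which is what makes the stored events a source of truth.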

For Kleppmann, using events is the ideal way of storing data: all information is saved by appending it as a single blob, removing the need to update several tables. He also sees aggregation as the ideal way of reading data from a data store, since the current state is normally what interests a user. Comparing with a user interface, a user clicking a button corresponds to an event, and requesting a page corresponds to an aggregation retrieving the current state. Kleppmann derives a pattern from his example: incoming raw events are immutable facts, easy to store and a source of truth. Aggregates are derived from these raw events, cached, and updated when new events arrive. If needed, all events can be replayed, recreating all aggregates.
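The pattern described above can be sketched as follows. This is a toy illustration under assumed event and aggregate shapes (page views counted per URL); the key property is that the cached aggregate and a full replay of the log always agree:

```python
# Pattern sketch: an append-only event log is the source of truth; a cached
# aggregate is updated as events arrive and can always be rebuilt by replay.
log = []                                   # immutable facts, append-only
cache = {}                                 # derived aggregate, kept up to date

def record(event):
    log.append(event)                      # write path: append a single blob
    cache[event["url"]] = cache.get(event["url"], 0) + 1   # update aggregate

def rebuild():
    fresh = {}
    for event in log:                      # replay all events from scratch
        fresh[event["url"]] = fresh.get(event["url"], 0) + 1
    return fresh

for url in ["/home", "/home", "/about"]:
    record({"url": url})

assert rebuild() == cache                  # replay reproduces the cached state
```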

Moving to an event-sourcing-like approach is a move away from the traditional way databases are used, storing current state. Kleppmann's reasons for nonetheless doing this include:

Loose coupling, achieved by the separate schemas used for writing and reading.

Increased performance, since separate schemas enable independent optimization of reads and writes and may also avoid (de)normalization discussions.

Flexibility from the possibility of trying a new algorithm for creating aggregates, which can then either be discarded or replace the old algorithm.

Error scenarios are easier to handle and reason about, since the events leading up to a failure can be replayed.
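The flexibility point can be made concrete: because the raw events are kept, a new aggregation algorithm can be evaluated against the full history alongside the old one before it replaces it. The aggregates below (total views vs. unique visitors per URL) are hypothetical, chosen only to show two algorithms over the same log:

```python
# Two aggregation algorithms replayed over the same stored event log.
events = [
    {"url": "/home", "ip": "10.0.0.1"},
    {"url": "/home", "ip": "10.0.0.1"},
    {"url": "/about", "ip": "10.0.0.2"},
]

def views_per_url(events):                 # the "old" aggregate
    counts = {}
    for e in events:
        counts[e["url"]] = counts.get(e["url"], 0) + 1
    return counts

def unique_visitors_per_url(events):       # a "new" algorithm, run over old data
    visitors = {}
    for e in events:
        visitors.setdefault(e["url"], set()).add(e["ip"])
    return {url: len(ips) for url, ips in visitors.items()}

old = views_per_url(events)               # {"/home": 2, "/about": 1}
new = unique_visitors_per_url(events)     # {"/home": 1, "/about": 1}
```

If the new aggregate turns out to be useless, it is simply discarded; the stored events are untouched either way.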

Actor frameworks, e.g. Akka, Orleans and Erlang OTP, are based on streams of immutable events, but Kleppmann notes that they are primarily a mechanism for dealing with concurrency, less so for data management.

Beyond Metrics and Logging with Metered Software Memories

A proposal for a different approach to application performance monitoring, claimed to be far more efficient, effective and extensible than traditional approaches based on metrics and event logging. Instead of seeing logging and metrics as primary data sources for monitoring solutions, we should see them as a form of human inquiry into software execution behavior that is happening or has happened. With this in mind, it becomes clear that logging and metrics do not serve as a complete, contextual and comprehensive representation of software execution behavior.

Software memories allow us to employ multiple techniques of discovery; they are not limited to what we know today or to the tools and techniques available to us at this time. If software machine (behavioral) memories can always be simulated, with the ability to augment each playback of a simulation, then there are no limits to what questions can be asked of the past, in the present and in the future.
