Catching up with Esper: Event Stream Processing Framework

Esper (whose version 1.0 was announced on InfoQ more than a year ago) is an event stream processing (ESP) and complex event processing (CEP) engine that triggers actions when event conditions occur among event streams. It can be thought of as a database turned upside down: statements are registered once, and data streams flow through them. Event processing is a growing trend in the software industry, and several vendors have entered the market following a number of startups. Common use cases range from algorithmic trading, BAM, RFID, advanced monitoring systems, and fraud detection to a direct relationship with SOA. InfoQ caught up with Thomas Bernhardt and Alexandre Vasseur on recent developments with the project.
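The "database turned upside down" idea can be illustrated with a toy sketch (plain Python, not Esper's EPL or API; all names here are made up for illustration): queries are registered up front, and each arriving event is evaluated against all of them.

```python
class ToyEngine:
    """Minimal illustration of continuous queries: statements are
    registered once and data flows through them -- the inverse of a
    database, where data sits still and queries come and go."""

    def __init__(self):
        self.statements = []  # (predicate, listener) pairs

    def register(self, predicate, listener):
        self.statements.append((predicate, listener))

    def send_event(self, event):
        # Every incoming event is matched against all registered statements.
        for predicate, listener in self.statements:
            if predicate(event):
                listener(event)

engine = ToyEngine()
matches = []
# Roughly "select * from StockTick where price > 100" as predicate + listener.
engine.register(lambda e: e["type"] == "StockTick" and e["price"] > 100,
                matches.append)

engine.send_event({"type": "StockTick", "symbol": "ACME", "price": 101.5})
engine.send_event({"type": "StockTick", "symbol": "ACME", "price": 99.0})
# Only the first tick satisfies the registered statement.
```

A real engine adds sliding windows, joins, and time-based patterns on top of this basic inversion, but the registration-then-streaming shape is the same.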

According to the Esper team, Esper is currently the only pure-Java open source ESP/CEP engine that is also commercially supported, by a company named EsperTech, which also maintains a .NET implementation.

I think the fact that Esper plays a role in BEA's product helps the Esper project in a couple of ways. First, feedback gained is incorporated back into an improved Esper. Second, the BEA product raises overall awareness of CEP/ESP technology hugely and thus enlarges the mindshare and market. Third, it's a great testimony to how open, extensible, and enterprise-ready Esper technology is. The Esper community and user base is really proud of that relationship.

With the growth of this market space and the presence of multiple competing implementations, standardization could provide some benefit. Thomas commented on the potential for, and background of, CEP language standardization:

The CEP community clearly sees CEP and ESP as complementary, and recognizes that other approaches (e.g. Bayesian or neural networks) also apply to CEP problems. In light of the various approaches, and vendors not agreeing, the most relevant standard appears to be emerging from the work of the ANSI SQL standardization committee on extending SQL to provide "pattern matching in sequence of rows".

There will surely be further work on this early-stage topic, and standardization will likely go beyond ESP/CEP language standardization.

Esper exceeds 500,000 events per second on dual-CPU 2GHz Intel hardware, with average engine latency below 3 microseconds (and below 10 microseconds with better than 99% predictability) on a VWAP benchmark with 1,000 statements registered in the system; this tops out at 70 Mbit/s at 85% CPU usage.

Although based on a rather simple use case, the publication of this benchmark is aimed at shaking up the industry, as it comes with a complete kit to replay it. An Esper event server listens to remote clients sending stock market events over the network. The Esper engine is configured to compute the volume-weighted average price of the feeds in real time over a sliding window of time or events.
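As a rough illustration of what the benchmark computes (a plain Python sketch, not the benchmark kit or Esper itself), a VWAP over a sliding count window can be maintained incrementally per tick:

```python
from collections import deque

class SlidingVwap:
    """Volume-weighted average price over the last `size` events,
    updated incrementally as each tick arrives."""

    def __init__(self, size):
        self.window = deque()
        self.size = size
        self.pv_sum = 0.0   # running sum of price * volume
        self.vol_sum = 0.0  # running sum of volume

    def on_tick(self, price, volume):
        self.window.append((price, volume))
        self.pv_sum += price * volume
        self.vol_sum += volume
        if len(self.window) > self.size:
            # Slide the window: retract the oldest tick's contribution.
            old_p, old_v = self.window.popleft()
            self.pv_sum -= old_p * old_v
            self.vol_sum -= old_v
        return self.pv_sum / self.vol_sum

vwap = SlidingVwap(size=3)
vwap.on_tick(10.0, 100)
vwap.on_tick(11.0, 200)
result = vwap.on_tick(12.0, 100)  # VWAP over the three ticks above
```

The incremental update (add the new tick, retract the expired one) is what keeps per-event cost constant regardless of window size, which is what makes microsecond latencies plausible at these rates.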

Asked about the need for such a benchmark, the Esper team responded:

The CEP market has been a place of vague information regarding performance and latency, with every vendor throwing its figures in the press without any details at all. No comparative benchmarks exist in this area yet.

Vague performance information in this industry had already been criticized by Progress Apama and others. Here is a compilation from the Apama blog:

* Skyler manages rates as high as 200,000 messages/second
* Key feature: Coral8 handles thousands to millions of events per second
* StreamBase extends performance leadership by processing over one million events per second with near zero latency
* Aleri Labs breaks sub-millisecond latency barrier

Apama itself claims to be "a high-performance, scalable processing engine that can process thousands of events per second". Similar claims could also be found in BEA's wording around their WebLogic Event Server announcement, with lower yet more precise figures: "As we come out of the gate, we're going to provide 50,000 complex events per second".

These results seem to confirm that "hundreds of thousands" of events per second is common, not exceptional, in this area, and they also show exactly how Esper performs on the given scenarios. They give the user community valuable material to better assess performance, instead of listening to the random vendor FUD commonly thrown at disruptive yet affordable open source software.

The Esper team has also published the details of all its runs on its wiki and updated its product website with a performance section and performance best practices section. Another source of benchmarks may be coming from the newly formed STAC benchmark council, which aims to put out customer-driven benchmark standards for trading technology.

I believe kdb+ (at kx.com, by the way) is primarily a time-series database. As such it does not have features for causality (a "happened before" relation between events), and I expect it to be pretty hard to deal with the "absence of events". By contrast, Esper features both event streams (dealing with sliding windows of real-time data) and complex events (happened-before and other time-guard constructs). The VWAP benchmark that illustrates Esper performance is certainly a simple enough use case to compare Esper with a time-series database, but it is far from giving a complete overview of what can be achieved with Esper's event processing and continuous query capabilities.

That said, kdb+ is an interesting piece of software, and I believe it should be quite possible to integrate Esper with KX kdb+ for, e.g., event replay capabilities.
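The "absence of events" pattern mentioned above can be sketched in plain Python (not Esper's pattern language; event kinds "A"/"B" and the lazy expiry are illustrative assumptions): flag requests whose expected follow-up never arrived within a timeout.

```python
def find_missing_followups(events, timeout):
    """events: (timestamp, kind, ident) tuples in time order.
    Returns idents whose 'A' event was not followed by a matching 'B'
    within `timeout` time units. A real CEP engine would fire these
    absences with timers; this batch sketch detects them lazily, when
    a later event reveals that a deadline has passed."""
    pending = {}   # ident -> timestamp of the still-unmatched A event
    missing = []
    for ts, kind, ident in events:
        # Expire any A whose deadline passed before this event arrived.
        for i, t0 in list(pending.items()):
            if ts - t0 > timeout:
                missing.append(i)
                del pending[i]
        if kind == "A":
            pending[ident] = ts
        elif kind == "B":
            pending.pop(ident, None)
    return missing

stream = [(0, "A", "x"), (1, "B", "x"),  # x answered in time
          (2, "A", "y"),                 # y never answered
          (10, "tick", "-")]             # clock advance exposes y
overdue = find_missing_followups(stream, timeout=5)
```

This is exactly the kind of query that is awkward to express against a table of stored rows: the interesting fact is a row that does not exist.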

Unrelated to the kdb+ question above - I had cause to use Esper recently to measure application performance. On a current SOA project I was able to determine application latency and throughput by matching the correlation IDs being output by the various systems. Adding a performance monitor with Esper took about three hours from concept to finished implementation.
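The correlation-ID approach the commenter describes can be sketched as follows (a plain Python illustration with made-up record shapes, not the actual project code): take each ID's first and last sighting across the systems' logs and use the difference as end-to-end latency.

```python
def latency_by_correlation_id(log_records):
    """log_records: iterable of (timestamp, correlation_id) pairs
    emitted by the various systems a request passes through.
    Returns per-request end-to-end latency: last sighting minus
    first sighting of each correlation ID."""
    first_seen = {}
    last_seen = {}
    for ts, cid in log_records:
        first_seen.setdefault(cid, ts)              # keep earliest timestamp
        last_seen[cid] = max(ts, last_seen.get(cid, ts))  # keep latest
    return {cid: last_seen[cid] - first_seen[cid] for cid in first_seen}

# Two requests observed at different points in the pipeline.
records = [(100, "req-1"), (105, "req-2"), (130, "req-1"), (160, "req-2")]
lat = latency_by_correlation_id(records)
```

In a streaming engine the same join on correlation ID runs continuously over live events rather than over a collected batch, which is what made the three-hour turnaround plausible.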

Just got back from the Gartner CEP event and did not hear about Esper there (not a big surprise, not much talk about open source in general there). This article is dated, so I'm wondering what is going on with Esper? Has it gained any momentum?

As far as benchmarks go, it seems that with event processing, low latency and messages per second are stressed to the extreme. Performance in the microsecond range may be the cost of entry to play in the algorithmic trading space, but there are many, many more use cases for event processing where the events are measured in hundreds or thousands per second, not millions.

At the show, even many trading companies mentioned that performance was not their main criteria for picking a vendor ...