Engineering the information superhighway

Introducing Heka

We here on the Mozilla Services team are happy to announce our first beta release (v0.2b1) of Heka, a tool for high performance data gathering, analysis, monitoring, and reporting. Heka’s main component is hekad, a lightweight daemon that can run on nearly any host machine and does the following:

Delivers any received or internally generated message data to an external location. Data might be written to a database, a time series db, a file system, or a network service, including an upstream hekad instance for further processing and/or aggregation.

Heka is written in Go, which has proven well-suited to building a data pipeline that is both flexible and fast; initial testing shows a single hekad instance is capable of receiving and routing over 10 gigabits per second of message data. We’ve also borrowed and extended some great ideas from Logstash and have built Heka as a plugin-based system. Developers can build custom Input, Decoder, Filter (i.e. data-processing), and Output plugins to extend functionality quickly and easily.
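To make the four plugin roles concrete, here is a minimal sketch of an Input, Decoder, Filter, Output chain in Go. Everything in it is an assumption made for illustration: the `Message` struct, the function types, `decodeColon`, and `runPipeline` are hypothetical names and do not reflect Heka's actual plugin interfaces.

```go
package main

import (
	"fmt"
	"strings"
)

// Message is a hypothetical stand-in for a pipeline's internal message format.
type Message struct {
	Type    string
	Payload string
}

// The four stages, modelled as plain function types. Illustrative only;
// these are not Heka's plugin interfaces.
type (
	Input   func() []string                  // produces raw records
	Decoder func(raw string) (Message, bool) // converts a raw record to a Message
	Filter  func(m Message) (Message, bool)  // processes a Message, or drops it
	Output  func(m Message)                  // delivers a Message somewhere
)

// decodeColon parses records of the form "type: payload".
func decodeColon(raw string) (Message, bool) {
	parts := strings.SplitN(raw, ":", 2)
	if len(parts) != 2 {
		return Message{}, false
	}
	return Message{Type: parts[0], Payload: strings.TrimSpace(parts[1])}, true
}

// runPipeline pushes every record from in through dec and flt to out,
// and returns the number of messages that reached the output.
func runPipeline(in Input, dec Decoder, flt Filter, out Output) int {
	delivered := 0
	for _, raw := range in() {
		m, ok := dec(raw)
		if !ok {
			continue // undecodable record
		}
		if m, ok = flt(m); !ok {
			continue // dropped by the filter
		}
		out(m)
		delivered++
	}
	return delivered
}

func main() {
	in := func() []string { return []string{"error: disk full", "junk", "info: started"} }
	onlyErrors := func(m Message) (Message, bool) { return m, m.Type == "error" }
	out := func(m Message) { fmt.Println(m.Type, "->", m.Payload) }
	fmt.Println("delivered:", runPipeline(in, decodeColon, onlyErrors, out))
}
```

A real daemon would run these stages concurrently and route messages by matching rules rather than in a single loop; the sketch only shows how the responsibilities divide.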

All four of the plugin types can be implemented in Go, but managing these plugins requires editing the config file and restarting and, if you’re introducing new plugins, even recompiling the hekad binary. Heka provides another option, however, by allowing for “Sandboxed Filters,” written in Lua instead of Go. They can be added to and removed from a running Heka instance without the need to edit the config or restart the server. Heka also provides some Lua APIs that Sandboxed Filters can use for managing a circular buffer of time-series data, and for generating ad-hoc graph reports (such as the following example) that will show up on Heka’s reporting dashboard:

Heka is a new technology. We’re running it in production in a few places inside Mozilla, but it’s still a bit rough around the edges. Like everything Mozilla produces, however, it’s open source, so we’re releasing early and often to make it available to interested developers (contributors / pull requests welcome!) and early adopters. Here’s a list of resources for those who’d like to learn more:

19 responses

As the founder of the HekaFS project (http://www.hekafs.org) I’m curious: what’s the derivation of the name? In our case it was an Egyptian god of magic, but most people latch on to the NorCal “hecka” meaning and that’s OK too.

@Jeff Darcy: The original inspiration for the name was a shortened version of “Hekatonkheires”, the hundred-armed giants of Greek mythology, due to the way our software can reach its tendrils into all sorts of crevices to get to your data. But the Egyptian reference was not lost on us, and I *do* live in South Berkeley / North Oakland… 😉

@Royi: Heka at its core is storage-agnostic, but it can interact with Graphite in a number of ways. We don’t have a huge corpus of plugins yet, but among the ones we do have are a) a StatsdOutput, which can send messages to an upstream statsd/graphite rig, b) a StatsdInput, which can accept statsd messages itself, and c) a WhisperOutput, which can write statsd / time series data out to graphite’s whisperdb format, ready to be exposed via graphite-web.

Also, alongside or (in some cases) as an alternative to Graphite, there is a circular buffer API that is exposed to the sandboxed Lua plugins. If you write Lua scripts and populate your buffers with time series data, then the DashboardOutput can automatically generate graphs such as the one you can see in the screen shot above.

@RDC: We’re certainly not setting out to compete with anyone in particular. And we’re not at the point where we’re trying to position a product, we’re mainly just focused on building for our own needs. There’s definitely some overlap in the problem space, however.

@mheim: Nope, sorry. Building out our set of plugins is one of our highest priorities. Contributors welcome!

@nduthoit: There is a lot of overlap w/ Logstash, but there are a few important (to us) differences. For one, decoding input data into Heka’s internal message format is an explicit step in the process (i.e. we have decoder plugins, Logstash doesn’t). Also, Heka being written in Go provides a number of advantages in terms of app design (Go’s concurrency primitives do, indeed, rock), performance (for some tasks we’re seeing Heka use an order of magnitude fewer system resources than Logstash), and debugging / deployment (Logstash only works on JRuby, and while it’s of course possible to stuff the whole JVM into an executable, Go’s runtime is leaner and less in our way).

Finally, Heka’s support for dynamic loading of filter plugins w/o the need to edit config files or restart the server is a major selling point. Lua’s VM is very very fast, and has great sandboxing support, so Heka can watch the amount of CPU and RAM that plugins are consuming and shut down any that start using too much. This makes it possible to let folks write their own data processing code and try it out on a running system w/ minimal risk to any critical processing / routing that might be happening in the same Heka instance.

Logstash is an excellent product, though, far more mature than Heka and with a much larger set of plugins. We have a great deal of respect for Logstash’s developers and community. Otherwise we wouldn’t have stolen so many of their ideas. 😉

@tk: Heka doesn’t support “authentication”, per se. It does support the use of signed messages, however, for verification of message origin. Is this what you’re asking about? If so, it’s unfortunately not extremely well documented at the moment. The best place to start would be taking a look at the `sbmgr` command line utility, which generates signed messages and sends them to Heka to issue commands for adding and/or removing dynamic filter plugins. There’s a wee bit of documentation at http://hekad.readthedocs.org/en/latest/sandbox/manager.html, and if you’re intrepid you can figure out all of the gritty details by reading the source: https://github.com/mozilla-services/heka/tree/master/cmd/sbmgr . Hope this helps!
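For readers unfamiliar with the general idea of signed messages, the following is a minimal sketch of verifying message origin with a shared-secret HMAC, using Go's standard `crypto/hmac` package. The choice of HMAC-SHA256, the hex encoding, and the `sign` / `verify` helper names are assumptions for illustration; the thread does not describe the scheme Heka or `sbmgr` actually uses.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// sign returns a hex-encoded HMAC-SHA256 of payload under key.
// (Assumed scheme for illustration; not taken from Heka's source.)
func sign(key, payload []byte) string {
	mac := hmac.New(sha256.New, key)
	mac.Write(payload)
	return hex.EncodeToString(mac.Sum(nil))
}

// verify recomputes the HMAC and compares it to the supplied signature
// in constant time. It returns false for malformed signatures.
func verify(key, payload []byte, sigHex string) bool {
	sig, err := hex.DecodeString(sigHex)
	if err != nil {
		return false
	}
	mac := hmac.New(sha256.New, key)
	mac.Write(payload)
	return hmac.Equal(sig, mac.Sum(nil))
}

func main() {
	key := []byte("shared-secret")         // hypothetical signer key
	payload := []byte("load filter: demo") // hypothetical command message
	sig := sign(key, payload)
	fmt.Println("valid:", verify(key, payload, sig))
	fmt.Println("tampered:", verify(key, []byte("load filter: evil"), sig))
}
```

The receiver only acts on a message when `verify` succeeds, so a sender without the shared key cannot forge commands, and any change to the payload in transit invalidates the signature.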

@Mateusz: Yes, in fact, we did take a look at Rust. And, were Rust farther along, we might have ended up using it; it would be a good choice. But Rust is still evolving a bit too rapidly for production software, and the tools aren’t as developed. I think Rust is an exciting project with a lot to offer, but it’s not quite ready for most uses.

This looks very useful. With a project like this, the power — and the challenge — is in how you configure the software to build a pipeline to transform data from a bunch of different sources into useful stuff on the backend.

It’d be great to include some “recipes” in the documentation, for how to put the inputs, filters, and outputs together. For example, parsing an nginx access_log to pull out status codes, transform them into counters per class (2xx, 3xx, etc.), and send them to a statsd server (or graph them in Heka?) The current fragments in the docs are helpful, but it’s hard to see how it all fits together.
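As a rough illustration of the data transformation described in that example (not a Heka configuration, and not a Heka plugin), here is a standalone Go sketch that pulls status codes out of nginx access log lines, counts them per class, and renders the counts as statsd counter lines. It assumes nginx's default "combined" log format; the `nginx.status` prefix and all function names are made up for the example, and actually sending the lines to a statsd server over UDP is left out.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
	"strconv"
)

// statusRe captures the three-digit status code that follows the quoted
// request field in nginx's default "combined" log format.
var statusRe = regexp.MustCompile(`"[^"]*" (\d{3}) `)

// statusClass maps an HTTP status code to its class label, e.g. 404 -> "4xx".
func statusClass(code int) string {
	return strconv.Itoa(code/100) + "xx"
}

// countClasses tallies status classes over a set of access log lines,
// skipping lines that do not match the expected format.
func countClasses(lines []string) map[string]int {
	counts := make(map[string]int)
	for _, line := range lines {
		m := statusRe.FindStringSubmatch(line)
		if m == nil {
			continue
		}
		code, err := strconv.Atoi(m[1])
		if err != nil {
			continue
		}
		counts[statusClass(code)]++
	}
	return counts
}

// statsdLines renders the counts as statsd counter lines of the form
// "<bucket>:<value>|c", sorted by bucket name for stable output.
func statsdLines(prefix string, counts map[string]int) []string {
	keys := make([]string, 0, len(counts))
	for k := range counts {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	out := make([]string, 0, len(keys))
	for _, k := range keys {
		out = append(out, fmt.Sprintf("%s.%s:%d|c", prefix, k, counts[k]))
	}
	return out
}

func main() {
	lines := []string{
		`1.2.3.4 - - [10/Apr/2013:12:00:00 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/7.29.0"`,
		`1.2.3.4 - - [10/Apr/2013:12:00:01 +0000] "GET /missing HTTP/1.1" 404 169 "-" "curl/7.29.0"`,
		`1.2.3.4 - - [10/Apr/2013:12:00:02 +0000] "GET /about HTTP/1.1" 200 845 "-" "curl/7.29.0"`,
	}
	for _, l := range statsdLines("nginx.status", countClasses(lines)) {
		fmt.Println(l)
	}
}
```

In pipeline terms, the regular expression plays the decoder's role, the per-class counting is the filter step, and the statsd rendering is what an output would put on the wire.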

Unfortunately, it’s a little bit too early for us to have them already. Some of our usage patterns within Heka are still in flux (see https://github.com/mozilla-services/heka/issues/166), and even setting that aside, we’re still working out the details of our own internal configurations. We’d like to have some real configs working for a while so we can iron out the wrinkles before we start recommending anything to other people.

Heka is currently ready for early-adopter types, people who want to get their hands dirty writing plugins or who are willing to dig in and poke around a bit to figure out how things work. If you’re in that camp and you’re having a hard time piecing things together, we’re happy to answer questions on the mailing list (see above link) or in the #heka channel on irc.mozilla.org. If you’re not in that camp, then it’s probably best to hang tight until a future release (“Heka: Now with more handholding!”).

If not, implementing that would make it a great tool and a replacement for graphite (which is slow when the amount of data it handles becomes enormous). Also, parts of Go’s math library are implemented in assembly code. So implementing those functions here would be really good, and the performance can be expected to be better than that of the Python implementations (as in graphite-web).

@boopathi: No, Heka doesn’t yet support the graphite functions listed in your link. In fact, Heka doesn’t even yet have a “front end” for stored time series data. Heka has a statsd input, and the stats that come in there could either be forwarded on to an upstream carbon server or, if you’d rather, Heka also has a whisper output, which will produce a directory structure full of whisperdb files that works with graphite-web.

Somewhat relatedly, Heka also has a “circular buffer” library, explicitly designed for holding transient time series data, that it exposes to the sandboxed Lua filters. If you use this library and you specify a DashboardOutput in your Heka config, then you can very easily generate arbitrary graphs from your Lua plugins. This dashboard is very rudimentary right now, however; it’s more of a proof-of-concept than a well-polished UI. And, while we plan on making it possible to generate graphs from the statsd data that comes in, we haven’t got that in there yet.
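To show what a circular buffer for transient time series data looks like in general, here is a small Go sketch: a fixed number of time slots, where writing a newer timestamp advances the window and silently discards the oldest data. The `CircularBuffer` type and its `Add` / `Get` methods are hypothetical and are not the Lua API that Heka exposes to sandboxed filters.

```go
package main

import "fmt"

// CircularBuffer keeps one value per fixed-width time slot and retains
// only the most recent len(rows) slots. Hypothetical type for
// illustration; not Heka's circular buffer API.
type CircularBuffer struct {
	rows          []float64
	secondsPerRow int64
	latest        int64 // slot number of the newest row, -1 when empty
}

func NewCircularBuffer(rows int, secondsPerRow int64) *CircularBuffer {
	return &CircularBuffer{
		rows:          make([]float64, rows),
		secondsPerRow: secondsPerRow,
		latest:        -1,
	}
}

// Add adds value to the slot covering ts (non-negative unix seconds).
// It returns false if ts is older than the retained window.
func (cb *CircularBuffer) Add(ts int64, value float64) bool {
	slot := ts / cb.secondsPerRow
	n := int64(len(cb.rows))
	if cb.latest >= 0 && slot <= cb.latest-n {
		return false
	}
	if slot > cb.latest {
		if cb.latest >= 0 {
			// Zero every slot that is being reused as the window advances.
			start := cb.latest + 1
			if slot-cb.latest >= n {
				start = slot - n + 1
			}
			for s := start; s <= slot; s++ {
				cb.rows[s%n] = 0
			}
		}
		cb.latest = slot
	}
	cb.rows[slot%n] += value
	return true
}

// Get returns the value stored for the slot covering ts, and whether
// that slot is still inside the retained window.
func (cb *CircularBuffer) Get(ts int64) (float64, bool) {
	slot := ts / cb.secondsPerRow
	n := int64(len(cb.rows))
	if cb.latest < 0 || slot > cb.latest || slot <= cb.latest-n {
		return 0, false
	}
	return cb.rows[slot%n], true
}

func main() {
	cb := NewCircularBuffer(3, 60) // three one-minute slots
	cb.Add(0, 1)
	cb.Add(30, 2)  // same minute as the first sample
	cb.Add(60, 5)  // second minute
	cb.Add(200, 7) // fourth minute; the first minute falls out of the window
	for _, ts := range []int64{0, 60, 200} {
		v, ok := cb.Get(ts)
		fmt.Println(ts, v, ok)
	}
}
```

Because memory use is fixed by the number of rows, a structure like this suits short-lived dashboard graphs far better than long-term storage.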

@stu: An SNMP input is on the (long) list of input plugins we’d like to have. Which ones get implemented first depends on Mozilla’s internal needs and the perceived demand from the community. You can impact the latter of those influences by opening tickets in the Heka project’s issue tracker on github (https://github.com/mozilla-services/heka/issues). And of course pull requests are always welcome.