Open source Graylog puts Splunk on notice

The real battle between the big data suites will be waged in ecosystem development, not platform specs


Splunk, the log analysis system that's evolved into a full-blown platform for processing machine-generated data (also described as "Google for visual analytics"), faces a rising wave of open source competitors. One of the most prominent, Graylog, has unveiled its formal 1.0 release. Graylog's success won't hinge on meeting or exceeding Splunk's feature set or performance, though; it will depend on capturing or re-creating Splunk's existing ecosystem of users and applications.

Graylog is written in Java and uses a few key open source technologies: Elasticsearch, MongoDB, and Apache Kafka. The first two ought to be familiar; the third is a message broker that allows streamed data to be partitioned across a cluster of machines and is widely used in big data analysis. Graylog 1.0 uses Kafka to deliver better, "enterprise-ready" performance than its predecessors offered. Deployment recipes for Chef, Puppet, and Ansible are also included, and Graylog's overall architecture can be extended through a plug-in mechanism.

The biggest difference between Graylog and Splunk is the business model. Splunk is commercial software with a trial period, while Graylog is entirely open source, with various support contract offerings that start at $2,500 per year. In the short run, at least, Graylog, Inc. -- the company behind Graylog -- isn't too worried about cash flow, as it recently landed $2.5 million in financing.

Graylog's strategy, as explained in an email by company CEO Michael Sklar, is to "enable companies of all sizes to ingest all the data they need without worrying about financial constraints." He cited the history of IT as showing that "when a high-quality open source alternative for solving real-world problems emerges, users become aware, and over time, prefer open source to proprietary software."

For many, Splunk's price tag is justified not by the program itself, but by all that comes with it: its galaxy of analytics and data visualization tools, for instance. Also, an array of applications work directly with Splunk, creating a software ecosystem -- maybe not as massive as Hadoop's, but with some of the same flavor. Graylog's plan revolves not around offering immediate alternatives, but around fostering an (open source) environment where substitutes can be developed in time.

Graylog isn't the only open source competition for Splunk, either. Logstash, another project based on Elasticsearch, sports many of the same auxiliary features as Graylog: a plug-in architecture, Puppet and Docker deployment support, and so on. Logstash also received high marks from InfoWorld's Andrew Oliver because it can connect to a broad range of existing enterprise technology stacks. Yet another competitor -- albeit a commercial one -- is LucidWorks's SiLK, which makes use of Logstash along with Apache Flume, another open source log-management utility.