Splunk planning IT troubleshooting Wikipedia-like service

In August last year I wrote about Splunk, an enterprise startup that developed a search engine that captures IT data (logs, config files, message queues, SNMP, transactions, etc.), classifies it into events, and indexes those events by time, keyword, type and relationships between events. Splunk has been seeding the market with a free, basic version of the Professional edition (indexing up to 500 MB per day, with no advanced enterprise features). Now Splunk CEO Michael Baum plans to make the free version more “free.” Splunk has had an open source project (splunkforge.org) for extending the Splunk Server, and it now plans to offer a GPL-based version of the Splunk Server by the middle of this year. “It’s the classic commercial open source strategy--entry level open source version and an enterprise version that mixes open and closed source,” Baum told me. “We’ve always thought about it, but we want to make sure we have the resources to support the community.”

In a parallel community effort, the company is beta testing Splunk Base, which lets users from different enterprises collaborate in a global troubleshooting community, sharing knowledge based on their “splunking.” Splunk creates fingerprints of events as it indexes streams of data. “We use machine learning and figure out how to parse the data stream and pull out the stuff based on the fingerprinting technology we have built,” Baum said. “For [each] event, we build a fingerprint and cluster common fingerprints so users can search on them. Those fingerprints end up being universal; for example, if you are using Oracle10, it’s the same for any users, so with a universal handle two different companies could talk Oracle problems even if they never met before.” Every fingerprint will have its own wiki page on Splunk Base, with links to external information and RSS feeds, according to Baum. “The community is building a Wikipedia of IT information,” Baum said. “There are about 12,000 fingerprints today, but our goal is to grow Splunk Base, with a few hundred thousand users contributing.” This is a case where what's good for Splunk's business is good for the community, if done right.
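To make the fingerprint idea concrete, here is a minimal sketch of how an event fingerprint could work in principle. It is not Splunk's method: Baum describes a machine learning approach, while this example swaps in simple hand-written masking rules plus a hash. The function names (`template`, `fingerprint`, `cluster`), the masking patterns and the sample log lines are all hypothetical.

```python
import hashlib
import re
from collections import defaultdict

# Hypothetical masking rules (an assumption, not Splunk's parser): each rule
# replaces a variable part of a log line with a fixed placeholder.
_MASKS = [
    (re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b"), "<IP>"),  # IPv4-looking tokens
    (re.compile(r"\b0x[0-9a-fA-F]+\b"), "<HEX>"),          # hex literals
    (re.compile(r"\d+"), "<NUM>"),                         # any remaining digits
]


def template(line):
    """Reduce a raw event to its stable shape by masking variable fields."""
    for pattern, placeholder in _MASKS:
        line = pattern.sub(placeholder, line)
    return " ".join(line.split())  # normalize whitespace


def fingerprint(line):
    """Short, stable identifier derived from the event's template."""
    return hashlib.sha1(template(line).encode("utf-8")).hexdigest()[:12]


def cluster(lines):
    """Group raw events that share the same fingerprint."""
    groups = defaultdict(list)
    for line in lines:
        groups[fingerprint(line)].append(line)
    return dict(groups)


if __name__ == "__main__":
    events = [
        "Connection from 10.0.0.5 port 51234 refused",
        "Connection from 192.168.1.9 port 80 refused",
        "Disk /dev/sda1 is 91% full",
    ]
    for fp, members in cluster(events).items():
        print(fp, len(members), template(members[0]))
```

Because the fingerprint depends only on the shape of the event and not on site-specific values such as addresses or port numbers, two companies running the same software would compute the same identifier for the same kind of event. That shared identifier is the sort of "universal handle" a per-fingerprint wiki page could hang on.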

Like other companies that discovered the open source road after starting with closed source, such as Intalio, Splunk belatedly figured out that open sourcing is the better way to get broad distribution and to cultivate a community effect. Currently, Splunk has a few dozen paying customers, and about 3,000 using a trial version of the Professional edition (priced from $2,500 to $40,000 per server, depending on data volume), according to Baum. “There are a lot of people we aren’t reaching with the free, closed source product,” Baum told me. Open sourcing the core server code will attract more developers and administrators to try out the software, but the vast majority of the code will come from Splunk, as it does for MySQL and other major open source projects. Just as it has been for the free, closed version, the question will be which part of the hybrid product should remain proprietary and for how long. In the future, Splunk will be talking more about its support and advisory services and less about its per-server licensing fees...